Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A pkgdown website for data.table #3677

Merged
merged 62 commits into from
Sep 16, 2019
Merged

A pkgdown website for data.table #3677

merged 62 commits into from
Sep 16, 2019

Conversation

hadley
Copy link
Contributor

@hadley hadley commented Jul 3, 2019

jangorecki: WIP status in #3677 (comment)


You can see (a manual deploy) of the site at https://nervous-easley-6de120.netlify.com

Note that to build the pkgdown site yourself you'll need the latest version from GitHub (I had to fix a couple of bugs) and build with pkgdown::build_site(devel = FALSE, new_process = FALSE)

This is a very preliminary version presented only in order to have something concrete to respond to. To do:

  • If desired, pick a different theme from https://bootswatch.com
  • Produce a minimal image of the data.table syntax summary images
  • Polish overview paragraphs
  • Review structure of reference page
  • Prepare usage example

Note that I had to make a few minor changes to data.table sources to work cleanly with pkgdown. I think once you're happy with the basic website, we can discuss in more detail.

Fixes #3675

@codecov
Copy link

codecov bot commented Jul 3, 2019

Codecov Report

Merging #3677 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3677   +/-   ##
=======================================
  Coverage   99.42%   99.42%           
=======================================
  Files          72       72           
  Lines       13498    13498           
=======================================
  Hits        13421    13421           
  Misses         77       77

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1539ef6...7b361bf. Read the comment docs.

@MichaelChirico
Copy link
Member

Thanks so much @hadley!

Starting to look things over...

I'm afraid I don't follow you here:

Produce a minimal image of the data.table syntax summary images

Is that something on your end or for us?


I notice the mouse-over of 1.12.3 says Released version, but this is the dev version (1.12.2 is on CRAN; we use odd-even versions to distinguish), is that easy to change? Then, on the right, it says Dev status but we have a CRAN badge, so the two things are a bit at odds.


Am I right that the Changelog page looks for a foo() pattern and cross-references the package index to link to help pages? Would it be easy to match %foo% as well? In item 8 between() links but %between% doesn't.

between() and %between% are faster for POSIXct, #3519, and now support the .() alias, #2315. Thanks to @Henrik-P for the reports. There is now also support for bit64’s integer64 class and more robust coercion of types, #3517.

For the _pkgdown.yml config, we can use any of the \alias{}es to refer to a page right? And all of the non-\keyword{internal} pages must be indexed? Is it OK to comma-separate multiple of a pages aliases (e.g. I would switch IDateTime to IDate, ITime, IDateTime)? Is it OK to have duplicate references to a ref page? It might make sense to have IDate, ITime, IDateTime, then year, month, hour, day separately so that all the functions do get a callout on the reference page (even though they're all in the same Rd)


In general if a fix/change/upgrade is required in pkgdown, would be happy to try and help with the caveat that some primer about how things are working would be helpful.

@hadley
Copy link
Contributor Author

hadley commented Jul 4, 2019

To be precise, I think someone on the data table needs to redraw the minimal syntax image to remove the background colour, remove pdf download link in top right, and refine (or remove) the title.


Can you file an issue in pkgdown for this? Currently we assume that dev versions look like 1.0.0.9000, so we'll need to think about how to parameterise that differently.


I don't remember the exact details of how the autolinking works, but I think it should be straightforward to also autolink infix operators. Can you please file an issue?


There's currently very little control over how the aliases are display on the reference page. So far my position has been that if you think there are too many, then you should consider breaking up the topic into multiple pieces.


I don't have pkgdown front of mind at the moment, so I don't have any general advice, sorry 😞

@MichaelChirico
Copy link
Member

Re: minimal syntax image -- I had a go at reproducing that in text in my commit. Will leave it up to Matt how to proceed.

Re: issues: filed and filed

.gitlab-ci.yml Outdated
@@ -303,6 +305,10 @@ integration: # merging all artifacts to produce single R repository and summarie
- Rscript -e 'pdf.copy("data.table", "test-rel-lin")'
# web/checks/check_results_$pkg.html
- Rscript -e 'check.index("data.table", names(test.jobs))'
# pkgdown
- apt-get update -qq && apt-get install -y libxml2-dev
- Rscript -e 'install.packages("remotes", repos=Sys.getenv("CRAN_MIRROR"), quiet=TRUE); remotes::install_github("r-lib/pkgdown", repos=Sys.getenv("CRAN_MIRROR"), quiet=TRUE); install.packages("data.table", repos=file.path("file:",normalizePath("bus/integration/cran")), quiet=TRUE); pkgdown::build_site(devel=FALSE, new_process=TRUE, override=list(destination="./pkgdown"))'
Copy link
Member

@jangorecki jangorecki Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to speed it up, currently it installs remotes, pkgdown and 59! other packages. Plus it requires OS dependency libxml2-dev. As a result this job takes 16 minutes while previously took 5 minutes.
This would help: r-lib/pkgdown#1115

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does gitlab not have caching? So you have to install these packages on every run?

Copy link
Member

@jangorecki jangorecki Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitLab caching is currently in use when mirror suggested dependencies. We do publish all of them together with devel data.table in our R repository, to achieve ultimate reproducibility.
As of now we do not need to use any other package in the pipeline, all CI is made in base R. Thus adding 60 packages related to CI only is pretty big thing. I will try to make a secondary CI-related cache/repo.

Just for reference, mirror-packages job:

mirror-packages: # download all recursive dependencies of data.table suggests and integration suggests from inst/tests/tests-DESCRIPTION
stage: dependencies
tags:
- linux
image: registry.gitlab.com/jangorecki/dockerfiles/r-base-dev
cache:
paths:
- bus/$CI_BUILD_NAME/cran
variables:
R_BIN_VERSION: "3.6"
R_DEVEL_BIN_VERSION: "3.7"
script:
- echo 'source("ci.R")' >> .Rprofile
- mkdir -p bus/$CI_BUILD_NAME/cran/src/contrib
# mirror R dependencies: source, win.binary
- Rscript -e 'mirror.packages(dcf.dependencies(c("DESCRIPTION","inst/tests/tests-DESCRIPTION"), "all"), repos=c(Sys.getenv("CRAN_MIRROR"), dcf.repos("inst/tests/tests-DESCRIPTION")), repodir="bus/mirror-packages/cran")'
- Rscript -e 'sapply(simplify=FALSE, setNames(nm=Sys.getenv(c("R_BIN_VERSION","R_DEVEL_BIN_VERSION"))), function(binary.ver) mirror.packages(type="win.binary", dcf.dependencies("DESCRIPTION", "all"), repos=Sys.getenv("CRAN_MIRROR"), repodir="bus/mirror-packages/cran", binary.ver=binary.ver))'
<<: *artifacts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after recent commits to pkgdown we went down to 13 minutes for this job
so 3/11 of the overhead is fixed

@jangorecki
Copy link
Member

Once we agree it is ready to merge I will then update url references from jangorecki.gitlab.io to Rdatatable.gitlab.io and restrict build to master branch only (during dev it also publish for pkgdown).

@MichaelChirico
Copy link
Member

Changelog

  • It's quite nice to have the internal links to old versions. We also have NEWS.0.md, is there any way to reference that? It so happens that we did link it at the bottom but that's pretty obscure. It might be nice to have the a toggle at the top, or for the last link on the right to redirect to that file on GH? @hadley this might be a pkgdown FR?

  • Section headers are also inconsistent @hadley:

Screenshot 2019-08-30 at 10 00 43 AM
Screenshot 2019-08-30 at 10 00 48 AM

For some reason the date as parsed shows up on the left only for v1.12.2?

  • I see this, feels a bit redundant, anything to be done? (in development) Unreleased

  • Is there a way to add the dates to the links on the right as well?

Manual

  • nafill() setnafill() appear twice, is that intentional? If so I would also add copy() to the in-place modification section (and generally double-reference anything that feels kind-of borderline)

Vignettes

  • in Intro vignette, in the next vignette at the end -- not totally clear what this is referring to

@jangorecki
Copy link
Member

jangorecki commented Aug 30, 2019

It's quite nice to have the internal links to old versions. We also have NEWS.0.md, is there any way to reference that? It so happens that we did link it at the bottom but that's pretty obscure. It might be nice to have the a toggle at the top, or for the last link on the right to redirect to that file on GH? @hadley this might be a pkgdown FR?

This is already possible, we can make a menu of changelog files, but I don't see it that much important.
NEWS.0.md is more for historical, not much practical, reasons.

date

related issue: r-lib/pkgdown#1118
maybe worth to put your comments there

next vignette

we should explicitly name it, or if possible put a portable link

double referencing

I don't think it is an issue to double reference. What would be useful is to put only setnafill in one group and nafill in another. But pkgdown seems to put all related links from same manual file in the same line.

DESCRIPTION Outdated
@@ -49,7 +49,7 @@ Suggests: bit64, curl, R.utils, knitr, xts, nanotime, zoo, yaml
SystemRequirements: zlib
Description: Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
License: MPL-2.0 | file LICENSE
URL: http://r-datatable.com
URL: https://github.com/Rdatatable/data.table
Copy link
Member

@mattdowle mattdowle Sep 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jangorecki why this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not precisely sure, https://pkgdown.r-lib.org/articles/linking.html vignette mention to use github repo url. I reverted back to use two urls, as initially commited by Hadley

Copy link
Member

@jangorecki jangorecki Sep 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure but we might need to use three urls there, as pkgdown website will be published under: https://Rdatatable.gitlab.io/data.table
now during dev: https://jangorecki.gitlab.io/data.table/

@mattdowle mattdowle merged commit 6855ee7 into Rdatatable:master Sep 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

data.table homepage
4 participants