Return a data frame containing the paths of files in a GitHub repository.
Generally used prior to spot_{funs/pkgs}_files().
Usage
list_files_github_repo(
  repo,
  branch = NULL,
  pattern = stringr::regex("(r|rmd|rmarkdown|qmd)$", ignore_case = TRUE),
  rmv_index = TRUE
)
Arguments
- repo
GitHub repository in "owner/repo" form, e.g. "brshallo/feat-eng-lags-presentation"
- branch
Branch of GitHub repository, default is "main".
- pattern
Regex pattern to keep only matching files. Default is
stringr::regex("(r|rmd|rmarkdown|qmd)$", ignore_case = TRUE), which will keep only R, Rmarkdown and Quarto documents. To keep all files use ".".
- rmv_index
Logical, most repos containing blogdown sites will have an index.R file at the root. Change to
FALSE if you don't want this file removed.
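The effect of pattern and rmv_index can be sketched offline on a made-up vector of paths. This is only an illustration under stated assumptions: the paths are hypothetical, base grepl() with ignore.case = TRUE stands in for the stringr::regex() default, and dropping an exact "index.R" match is a guess at what rmv_index does, not the package's actual implementation.

```r
# Hypothetical file paths, invented for illustration only
paths <- c(
  "index.R", "R/clean.R", "analysis/report.Rmd",
  "post/index.qmd", "README.md", "data/raw.csv", "bin/runner"
)

# Base-R stand-in for the documented default pattern
default_pattern <- "(r|rmd|rmarkdown|qmd)$"
kept <- paths[grepl(default_pattern, paths, ignore.case = TRUE)]
print(kept)

# Assumption: rmv_index = TRUE drops only a root-level "index.R"
kept_no_index <- kept[kept != "index.R"]
print(kept_no_index)

# The default pattern is not anchored to a dot, so "bin/runner" is kept
# because it ends in "r". A stricter, user-supplied alternative:
strict_pattern <- "\\.(r|rmd|rmarkdown|qmd)$"
kept_strict <- paths[grepl(strict_pattern, paths, ignore.case = TRUE)]
print(kept_strict)
```

If extension-less files ending in "r" show up in a repo, a dot-anchored pattern like strict_pattern could be passed as the pattern argument instead of the default.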
Value
Data frame with columns relative_paths and absolute_paths giving the
file path locations. absolute_paths holds URLs to the raw files.
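To make the shape of the returned columns concrete, here is a minimal sketch that pairs relative paths with raw-file URLs. The helper raw_url() is hypothetical and not part of funspotr, and the URL layout (raw.githubusercontent.com, then repo, branch and path) is GitHub's usual raw-file scheme, assumed here rather than taken from the package source.

```r
# Hypothetical helper (not a funspotr function): build a raw-file URL
# from a repo ("owner/repo"), a branch name and a relative path.
raw_url <- function(repo, branch, path) {
  paste("https://raw.githubusercontent.com", repo, branch, path, sep = "/")
}

# Relative paths as they appear in the example output of this page
relative_paths <- c("R/Rmd-to-R.R", "R/feat-engineering-lags.R")

# A data frame with the two documented column names
files <- data.frame(
  relative_paths = relative_paths,
  absolute_paths = raw_url(
    "brshallo/feat-eng-lags-presentation", "main", relative_paths
  ),
  stringsAsFactors = FALSE
)
print(files)
```

A data frame with these two columns is the kind of input the spot_{funs/pkgs}_files() functions are described as taking.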
Examples
# \donttest{
library(dplyr)
library(funspotr)
# list the R, Rmarkdown and Quarto files in a GitHub repo
gh_urls <- list_files_github_repo("brshallo/feat-eng-lags-presentation", branch = "main")
# parse just the first 2 files
contents <- spot_funs_files(slice(gh_urls, 1:2))
contents %>%
  unnest_results()
#> # A tibble: 75 × 4
#>    funs            pkgs      relative_paths            absolute_paths
#>    <chr>           <chr>     <chr>                     <chr>
#>  1 purl            knitr     R/Rmd-to-R.R              https://raw.githubuserco…
#>  2 here            here      R/Rmd-to-R.R              https://raw.githubuserco…
#>  3 getOption       base      R/feat-engineering-lags.R https://raw.githubuserco…
#>  4 options         base      R/feat-engineering-lags.R https://raw.githubuserco…
#>  5 library         base      R/feat-engineering-lags.R https://raw.githubuserco…
#>  6 read_csv        (unknown) R/feat-engineering-lags.R https://raw.githubuserco…
#>  7 arrange         (unknown) R/feat-engineering-lags.R https://raw.githubuserco…
#>  8 mutate          (unknown) R/feat-engineering-lags.R https://raw.githubuserco…
#>  9 slide_index_dbl (unknown) R/feat-engineering-lags.R https://raw.githubuserco…
#> 10 days            (unknown) R/feat-engineering-lags.R https://raw.githubuserco…
#> # ℹ 65 more rows
# }
