Commit 6d3e45c6 authored by jennybc

make a md table for data-raw README

parent c7b2b385
@@ -25,30 +25,36 @@ suppressPackageStartupMessages(library(dplyr))
 library(stringr)
 suppressPackageStartupMessages(library(purrr))
 library(tidyr)
+library(knitr)
 f <- list.files()
 x <- f %>%
-  str_split_fixed("_", n = 2) %>%
+  str_split_fixed("[_\\.]", n = 3) %>%
   tbl_df() %>%
-  set_names(c("script","leftovers")) %>%
+  set_names(c("script", "slug", "ext")) %>%
   bind_cols(data_frame(f))
 x <- x %>%
-  filter(script %>% str_detect("^[0-9]+")) %>%
-  mutate(ext = leftovers %>% tolower() %>% tools::file_ext()) %>%
-  filter(ext %>% str_detect("r|md|tsv"))
-y <- x %>% nest(- script)
-mdlink <- function(x) paste0("[", x, "](", x, ")")
+  filter(script %>% str_detect("^[0-9]+"),
+         ext %>% str_detect("R|r|md|tsv")) %>%
+  select(-slug)
+y <- x %>%
+  nest(-script)
+collapse_md_links <- function(x) {
+  x %>% {
+    paste0("[", ., "](", ., ")")
+  } %>%
+    paste(collapse = ", ")
+}
 jfun <- function(z) {
-  r_script <- z$f[z$ext == "r"] %>% mdlink()
-  notebook <- z$f[z$ext == "md"] %>% mdlink()
-  tsv <- z$f[z$ext == "tsv"] %>% mdlink() %>% paste(collapse = " ")
-  text <- paste(r_script, notebook, tsv, sep = " | ")
-  text
+  data_frame(r_script = z$f[z$ext == "R"] %>% collapse_md_links(),
+             notebook = z$f[z$ext == "md"] %>% collapse_md_links(),
+             tsv = z$f[z$ext == "tsv"] %>% collapse_md_links())
 }
 y$data %>%
-  map_chr(jfun) %>%
-  paste(collapse = "\n\n") %>%
-  cat()
+  map_df(jfun) %>%
+  kable()
 ```
 ```{r eval = FALSE, echo = FALSE}
@@ -6,22 +6,15 @@ Cleaning history
 - 2014: I re-cleaned the data and (mostly) forced myself to pull it straight out of the spreadsheets. Used the `gdata` package. It was kind of painful, due to encoding and other issues. See the scripts in this state in [v0.1.0](https://github.com/jennybc/gapminder/tree/v0.1.0/data-raw).
 - 2015: I revisited the cleaning and switched to `readxl`. This was much less painful. Present day.
-[01\_extract-from-excel-pop.R](01_extract-from-excel-pop.R) | [01\_extract-from-excel-pop.md](01_extract-from-excel-pop.md) | [01\_pop.tsv](01_pop.tsv)
-[02\_extract-from-excel-lifeExp.R](02_extract-from-excel-lifeExp.R) | [02\_extract-from-excel-lifeExp.md](02_extract-from-excel-lifeExp.md) | [02\_lifeExp.tsv](02_lifeExp.tsv)
-[03\_extract-from-excel-gdpPercap.R](03_extract-from-excel-gdpPercap.R) | [03\_extract-from-excel-gdpPercap.md](03_extract-from-excel-gdpPercap.md) | [03\_gdpPercap.tsv](03_gdpPercap.tsv)
-[04\_merge-pop-lifeExp-gdpPercap.R](04_merge-pop-lifeExp-gdpPercap.R) | [04\_merge-pop-lifeExp-gdpPercap.md](04_merge-pop-lifeExp-gdpPercap.md) | [04\_gap-merged.tsv](04_gap-merged.tsv)
-[05\_impute-china-1952-gdpPercap.R](05_impute-china-1952-gdpPercap.R) | [05\_impute-china-1952-gdpPercap.md](05_impute-china-1952-gdpPercap.md) | [05\_gap-merged-with-china-1952.tsv](05_gap-merged-with-china-1952.tsv)
-[06\_smell-test-gap-merged.R](06_smell-test-gap-merged.R) | [06\_smell-test-gap-merged.md](06_smell-test-gap-merged.md) | []()
-[07\_fill-and-fix-continent.R](07_fill-and-fix-continent.R) | [07\_fill-and-fix-continent.md](07_fill-and-fix-continent.md) | [07\_gap-merged-with-continent.tsv](07_gap-merged-with-continent.tsv)
-[08\_filter-every-five-years.R](08_filter-every-five-years.R) | [08\_filter-every-five-years.md](08_filter-every-five-years.md) | [08\_gap-every-five-years.tsv](08_gap-every-five-years.tsv)
-[09\_add-data-to-package.R](09_add-data-to-package.R) | [09\_add-data-to-package.md](09_add-data-to-package.md) | []()
-[40\_make-color-scheme.R](40_make-color-scheme.R) | [40\_make-color-scheme.md](40_make-color-scheme.md) | [40\_continent-colors.tsv](40_continent-colors.tsv) [40\_country-colors.tsv](40_country-colors.tsv)
+| r\_script | notebook | tsv |
+|:------------------------------------------------------------------------|:--------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------|
+| [01\_extract-from-excel-pop.R](01_extract-from-excel-pop.R) | [01\_extract-from-excel-pop.md](01_extract-from-excel-pop.md) | [01\_pop.tsv](01_pop.tsv) |
+| [02\_extract-from-excel-lifeExp.R](02_extract-from-excel-lifeExp.R) | [02\_extract-from-excel-lifeExp.md](02_extract-from-excel-lifeExp.md) | [02\_lifeExp.tsv](02_lifeExp.tsv) |
+| [03\_extract-from-excel-gdpPercap.R](03_extract-from-excel-gdpPercap.R) | [03\_extract-from-excel-gdpPercap.md](03_extract-from-excel-gdpPercap.md) | [03\_gdpPercap.tsv](03_gdpPercap.tsv) |
+| [04\_merge-pop-lifeExp-gdpPercap.R](04_merge-pop-lifeExp-gdpPercap.R) | [04\_merge-pop-lifeExp-gdpPercap.md](04_merge-pop-lifeExp-gdpPercap.md) | [04\_gap-merged.tsv](04_gap-merged.tsv) |
+| [05\_impute-china-1952-gdpPercap.R](05_impute-china-1952-gdpPercap.R) | [05\_impute-china-1952-gdpPercap.md](05_impute-china-1952-gdpPercap.md) | [05\_gap-merged-with-china-1952.tsv](05_gap-merged-with-china-1952.tsv) |
+| [06\_smell-test-gap-merged.R](06_smell-test-gap-merged.R) | [06\_smell-test-gap-merged.md](06_smell-test-gap-merged.md) | []() |
+| [07\_fill-and-fix-continent.R](07_fill-and-fix-continent.R) | [07\_fill-and-fix-continent.md](07_fill-and-fix-continent.md) | [07\_gap-merged-with-continent.tsv](07_gap-merged-with-continent.tsv) |
+| [08\_filter-every-five-years.R](08_filter-every-five-years.R) | [08\_filter-every-five-years.md](08_filter-every-five-years.md) | [08\_gap-every-five-years.tsv](08_gap-every-five-years.tsv) |
+| [09\_add-data-to-package.R](09_add-data-to-package.R) | [09\_add-data-to-package.md](09_add-data-to-package.md) | []() |
+| [40\_make-color-scheme.R](40_make-color-scheme.R) | [40\_make-color-scheme.md](40_make-color-scheme.md) | [40\_continent-colors.tsv](40_continent-colors.tsv), [40\_country-colors.tsv](40_country-colors.tsv) |