Split a vector into chunks, either element by element or by unique values (groups), and return the result as a vector of chunk indices or as a list of elements.

chunk(.x, .nchunk = parallel::detectCores())

ids_per_plot(id, id_per_plot = 9)

chunk_grp(.x, .nchunk = parallel::detectCores())

chunk_list(.x, .nchunk = parallel::detectCores())

chunk_grp_list(.x, .nchunk = parallel::detectCores())

Arguments

.x

vector of values

.nchunk

number of chunks to split .x into

id

vector of IDs (e.g., an ID column)

id_per_plot

number of IDs per plot. Defaults to 9

Functions

  • ids_per_plot(): split IDs into groups for subsequent plotting. Works well with Hadley Wickham's purrr package to create a column to split on and then plot each group; see vignette("Multiplot") for details and the sketch at the end of the Examples below

  • chunk_grp(): used when it is desirable to keep all occurrences of a unique value in the same chunk

  • chunk_list(): used when it is desirable for the output to be a list

  • chunk_grp_list(): used when it is desirable for the output to be a list, with all occurrences of a unique value kept in the same chunk

Examples

# Chunking returns a chunk index for each element, splitting the data
# as evenly as possible into the number of chunks specified.

chunk(letters[1:9], 3) 
#> [1] 1 1 1 2 2 2 3 3 3

# When the vector length isn't divisible by the number of chunks,
# some chunks receive one more element than others.

chunk(letters[c(1, 1, 2, 1:7)], 3)
#>  [1] 1 1 1 1 2 2 2 3 3 3

# To chunk evenly by unique values rather than by element count, use chunk_grp().
# Notice how the first chunk contains more elements because there are three 1's.

chunk_grp(c(1, 1, 1:7), 3)
#> [1] 1 1 1 1 1 2 2 3 3

# A common next step after chunking is splitting the elements into a list;
# chunk_list() does this directly.

chunk_list(letters[1:9], 3)
#> [[1]]
#> [1] "a" "b" "c"
#> 
#> [[2]]
#> [1] "d" "e" "f"
#> 
#> [[3]]
#> [1] "g" "h" "i"
#> 
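
# Because the default .nchunk is parallel::detectCores(), the list output
# pairs naturally with parallel workers. A minimal illustrative sketch
# (plain lapply() here; parallel::mclapply() or parLapply() could be
# substituted for actual parallel execution):

chunks <- chunk_list(1:20, 4)
lapply(chunks, sum)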

# To keep all occurrences of a unique value in the same chunk while returning a list:

chunk_grp_list(c(letters[1], letters[1], letters[1:7]), 3)
#> [[1]]
#> [1] "a" "a" "a" "b" "c"
#> 
#> [[2]]
#> [1] "d" "e"
#> 
#> [[3]]
#> [1] "f" "g"
#>
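
# The following is a minimal sketch of the ids_per_plot() workflow described
# in the Functions section. It assumes ids_per_plot() returns a grouping
# index aligned with `id` (analogous to chunk_grp()); the data are made up.

dat <- data.frame(
  ID = rep(1:27, each = 4),   # 27 subjects with 4 observations each
  DV = rnorm(27 * 4)
)
dat$panel <- ids_per_plot(dat$ID, id_per_plot = 9)

# With at most 9 IDs per panel, 27 IDs should yield 3 panels; purrr::map()
# (or lapply) can then draw one plot per list element.
by_panel <- split(dat, dat$panel)
length(by_panel)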