Measure Package Similarity

Usage

get_similar_packages(pkg, pkg_list = NULL)

Arguments

pkg

A single package name.

pkg_list

The value returned by a call to BiocPkgTools::biocPkgList(). If NULL (default), biocPkgList() is called internally. See Details.

Value

A tibble with two columns: package and similarity. package holds the name of every other package. similarity holds that package's similarity to pkg (see Details).

Details

If you are making multiple calls, it is more efficient to call BiocPkgTools::biocPkgList() once and pass the result as pkg_list than to let each call fetch the list internally. The same applies when passing the result to get_packages_by_view() or get_packages_by_views(). See vignette 'Optimisations' for a more comprehensive discussion and demonstration.

Currently, similarity is quantified in three steps: take the Hamming distance between pkg and each other package over the set of biocViews used to tag at least one package, divide by the number of biocViews in that set to normalise the distance to the range [0, 1], and subtract the result from one to yield a measure of similarity in the range [0, 1].
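
The calculation described above can be sketched in base R on a small, made-up set of packages and biocViews. The package names, the views assigned to them, and the names views, membership, and similarity_to() are all hypothetical and used only for illustration. This is not the package's internal implementation, which may differ.

```r
# Hypothetical toy data: the biocViews that tag each (made-up) package.
views <- list(
  pkgA = c("RNASeq", "DifferentialExpression", "Sequencing"),
  pkgB = c("RNASeq", "DifferentialExpression", "Regression"),
  pkgC = c("Proteomics", "MassSpectrometry")
)

# The set of biocViews used to tag at least one package.
all_views <- sort(unique(unlist(views)))

# Logical membership matrix: one row per package, one column per biocView.
membership <- t(vapply(
  views,
  function(v) all_views %in% v,
  logical(length(all_views))
))
colnames(membership) <- all_views

# Illustrative helper (not part of the package): similarity of `pkg`
# to every other package in `membership`.
similarity_to <- function(pkg, membership) {
  target <- membership[pkg, ]
  others <- setdiff(rownames(membership), pkg)
  # Hamming distance: number of biocViews on which the two packages disagree.
  hamming <- vapply(
    others,
    function(p) sum(membership[p, ] != target),
    integer(1)
  )
  # Normalise by the number of biocViews, then convert distance to similarity.
  similarity <- 1 - hamming / ncol(membership)
  result <- data.frame(
    package = others,
    similarity = unname(similarity),
    stringsAsFactors = FALSE
  )
  # Most similar packages first.
  result[order(-result$similarity), ]
}

similarity_to("pkgA", membership)
```

In this toy data there are six biocViews in total. pkgA and pkgB disagree on two of them, giving a similarity of 1 - 2/6, while pkgA and pkgC disagree on five, giving 1 - 5/6. Note that two packages also count as agreeing on every biocView that neither of them uses, which is why similarities computed over a large set of biocViews tend to sit close to one.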

Examples

get_similar_packages("edgeR")
#> 'getOption("repos")' replaces Bioconductor standard repositories, see
#> 'help("repositories", package = "BiocManager")' for details.
#> Replacement repositories:
#>     CRAN: https://p3m.dev/cran/__linux__/noble/latest
#> # A tibble: 2,352 × 2
#>    package           similarity
#>    <chr>                  <dbl>
#>  1 metaseqR2              0.908
#>  2 dearseq                0.883
#>  3 limma                  0.883
#>  4 dreamlet               0.877
#>  5 roastgsa               0.877
#>  6 crumblr                0.871
#>  7 variancePartition      0.871
#>  8 zenith                 0.871
#>  9 BASiCS                 0.865
#> 10 BASiCStan              0.865
#> # ℹ 2,342 more rows