Rust’s place with some previous and current Shiny projects

MicrobialLandScape

The MicrobialLandScape application allows users to upload bacterial abundance files (bacterial 16S rDNA amplicon results) and predict soil physicochemical parameters and geographical location. The models were trained on AMI soil 16S rDNA datasets, with metadata taken from the public sources AMI, NVIS and ALUM8. The models are still under development: their accuracy and performance have yet to be evaluated. A built-in dataset is included for demonstration purposes.

The information below gives some additional details about the training dataset used in the application.

UMAP for Australian Microbiome contextual data

Select samples using the rectangle tool in the widget at the top right of the AMI map. The samples within the selected geographical area will be highlighted on the UMAP figure below. Both graphs are interactive and support data filtering and navigation.
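The application itself is written in R, so the following Python sketch only illustrates the logic of a rectangular selection: keep the samples whose coordinates fall inside the drawn bounding box. The `Sample` class, the `select_in_rectangle` function and the coordinates are hypothetical and are not taken from the application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical minimal sample record: an ID plus map coordinates.
    sample_id: str
    lat: float
    lon: float


def select_in_rectangle(samples, south, west, north, east):
    """Return IDs of samples whose coordinates lie inside the rectangle."""
    return [
        s.sample_id
        for s in samples
        if south <= s.lat <= north and west <= s.lon <= east
    ]


# Made-up coordinates, for illustration only.
samples = [
    Sample("s1", -35.3, 149.1),
    Sample("s2", -42.9, 147.3),
    Sample("s3", -12.5, 130.8),
]

selected = select_in_rectangle(samples, south=-44.0, west=140.0, north=-30.0, east=155.0)
print(selected)  # IDs to highlight on the UMAP figure
```

The returned IDs would then be used to highlight the matching points in the UMAP plot. This simple comparison assumes the rectangle does not cross the antimeridian.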

UMAP embedding for OTU counts.

Dimensionality reduction was performed on the bacterial profiles taken from the BPA website: the UMAP function was applied to the transformed OTU count matrix. Counts of OTUs with matching “unknown” taxonomies were summed; in other words, different sequences with unknown taxonomies were treated as identical. Selected metadata columns were collected from the main contextual master sheet and merged with the UMAP results.
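The collapsing step can be sketched as follows. This is a Python illustration under stated assumptions, not the R code used for the application: the function name `collapse_by_taxonomy`, the nested-dictionary layout and the toy counts are all hypothetical, and grouping by the full taxonomy label is one possible reading of "matching" taxonomies.

```python
from collections import defaultdict


def collapse_by_taxonomy(otu_counts, taxonomy):
    """Sum per-sample counts of OTUs that share the same taxonomy label.

    otu_counts: {otu_id: {sample_id: count}}
    taxonomy:   {otu_id: label}; OTUs without a label are treated as "unknown"
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for otu_id, per_sample in otu_counts.items():
        label = taxonomy.get(otu_id, "unknown")
        for sample_id, count in per_sample.items():
            collapsed[label][sample_id] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {label: dict(counts) for label, counts in collapsed.items()}


# Toy data: two distinct sequences share the "unknown" label.
otu_counts = {
    "otu1": {"A": 5, "B": 0},
    "otu2": {"A": 2, "B": 7},
    "otu3": {"A": 1, "B": 3},
}
taxonomy = {"otu1": "Bacillus", "otu2": "unknown", "otu3": "unknown"}

merged = collapse_by_taxonomy(otu_counts, taxonomy)
print(merged)
```

Here `otu2` and `otu3` end up in a single "unknown" row whose counts are the sums across both sequences.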

Australian Microbiome contextual data.

Contextual metadata taken from the BPA website.

Number of soil samples by locality.
Locality	Samples
Christmas Island	16
Jervis Bay Territory	42
Sydney	48
Monaro	92
Australian Capital Territory	116
Antarctica: Windmill Islands	162
Tasmania	237
New South Wales	270
Antarctica	330
Queensland	347
Victoria	375
South Australia	406
Northern Territory	457
Western Australia	592

Major Vegetation Groups NVIS layer

The “Version 6.0 Major Vegetation Groups – Estimated Pre-1750 Vegetation” (NVIS 2023) refers to a classification system used to categorize and describe the major types of native vegetation that were estimated to have existed in Australia before European colonization and land use changes. The Major Vegetation Groups are a simplified representation of Australia’s native vegetation. They abstract away the details of individual plant species and focus on the overall patterns of vegetation distribution. They encompass various combinations of plant species within the canopy, shrub, or ground layers, displaying structural similarities and often being dominated by a single genus. From a mapping perspective, these groups represent the predominant vegetation found within a map unit, even in cases where several vegetation types coexist.

The R library leaflet was used to generate this plot.

CRC TiME

CRC – Cooperative Research Centres program
TiME – Transformation in Mining Economies

This project aims to establish a link between soil microbial communities and plant vegetation, with the possibility of extending the analysis to seed provisioning selection. The Shiny application supports exploratory analysis of a microbial dataset: sample profiles can be plotted on a scatter plot and highlighted by metadata values. UMAP is used for dimensionality reduction. Differential abundance analysis is performed with the DESeq2 package; only pairwise contrasts can be used for the heatmap plots.
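Since only pairwise contrasts are supported, the set of contrasts available for a metadata column is every unordered pair of its distinct levels. The Python sketch below illustrates that enumeration only; it does not call DESeq2, and the function name and vegetation labels are hypothetical.

```python
from itertools import combinations


def pairwise_contrasts(levels):
    """List every unordered pair of distinct levels in a metadata column."""
    unique = sorted(set(levels))
    return list(combinations(unique, 2))


# Hypothetical metadata column with three distinct levels.
vegetation = ["forest", "grassland", "forest", "shrubland"]
contrasts = pairwise_contrasts(vegetation)
print(contrasts)
```

A column with k distinct levels yields k(k-1)/2 contrasts, so three levels give three candidate heatmaps.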

To do:

Sample selection (inclusion/exclusion) for the UMAP embedding.
Better control of the heatmap figure:
- number of expected clusters (k-means),
- number of rows to show,
- number of samples,
- test interactive ComplexHeatmap.
Convert non-numerical columns to factor columns for coloring the UMAP.

FINDMYSNP

The findmysnp application is designed for quick navigation through Illumina microarray annotations. It was originally intended for genotyping BeadChip arrays only, but annotation from the methylation EPICv2 array was added later. The idea is to browse and compare arrays for the best gene and SNP content. Note that the EPICv2 array lists a huge number of SNPs in its report, but these are not directly queried by the EPIC methylation assay; the reported SNPs are merely in close vicinity to a CpG methylation probe/site on the EPIC array. Another point worth mentioning is the gene annotation: it may not match the Illumina product files 100%, because some extra work was performed on the annotation files to harmonize them across chips. This is mostly about matching rs-ids to associated genes. Some Illumina files may be missing gene symbols for particular rs-ids. To make the rs-id annotations uniform, all missing genes were populated from the entire chip collection: in other words, if an rs-id is annotated with a gene symbol on at least one chip, that symbol has been propagated across all chips. Two genome versions were added; note that the hg38 coordinates were created by liftover from hg19.
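The propagation rule described above can be sketched in a few lines. This is a Python illustration under assumptions, not the code used to build the annotation files: the function name, the `{chip: {rs_id: gene}}` layout, and the chip, rs-id and gene names are hypothetical. When chips disagree on a symbol, this sketch simply keeps the first one it encounters, which may differ from how the real files were harmonized.

```python
def propagate_gene_symbols(chips):
    """Fill missing gene symbols using any chip that annotates the same rs-id.

    chips: {chip_name: {rs_id: gene_symbol or None}}
    """
    # First pass: collect one known symbol per rs-id across all chips.
    known = {}
    for annotations in chips.values():
        for rs_id, gene in annotations.items():
            if gene:
                known.setdefault(rs_id, gene)

    # Second pass: fill the gaps; rs-ids never annotated anywhere stay None.
    return {
        chip: {rs_id: gene or known.get(rs_id) for rs_id, gene in annotations.items()}
        for chip, annotations in chips.items()
    }


# Placeholder identifiers, for illustration only.
chips = {
    "chipA": {"rs0001": "GENE_A", "rs0002": None},
    "chipB": {"rs0001": None, "rs0002": "GENE_B", "rs0003": None},
}

harmonized = propagate_gene_symbols(chips)
print(harmonized)
```

After the call, `rs0001` and `rs0002` carry the same gene symbol on both chips, while `rs0003` remains unannotated because no chip provides a symbol for it.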

The application is reasonably mature. A few extra features would probably be good to add: 1) Region-based search by chromosome and coordinate input. 2) Input format flexibility: comma-, newline- and space-separated lists. 3) An interface to include or exclude particular chips from the analysis. 4) New Illumina methylation chip.

Code and documentation for an older version are available on GitHub.

A Docker container with the Shiny application and R libraries is also available.

Note that this one includes only the GTA and EPIC chips.
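For the input format flexibility item, one possible approach is to split the pasted text on any run of commas or whitespace. The Python sketch below only illustrates the idea and is not part of the application; the function name `parse_id_list` and the example identifiers are hypothetical.

```python
import re


def parse_id_list(text):
    """Split a pasted list on commas, spaces, or line breaks.

    Empty tokens are dropped and duplicates removed, keeping first-seen order.
    """
    tokens = re.split(r"[,\s]+", text.strip())
    seen = set()
    ids = []
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


# Mixed separators, a repeated entry and a doubled comma.
raw = "rs0001, rs0002\nrs0003 rs0001,,rs0004"
print(parse_id_list(raw))
```

Treating all three separators the same way means the user does not have to choose an input format up front.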

misc

A visitor counter app for the Shiny logs.

Inspired by https://www.rcharlie.com/blog/shiny-monitor/.

Work in progress . . .
