Baitset Optimisation for Population-Level Studies of Sponge Hosts (MSc Thesis)

Sponges are notoriously difficult to work with from both a taxonomic and a phylogenomic perspective. Their simple bauplan offers relatively few diagnostic morphological features, which makes species identification challenging and has left the group taxonomically complex and under ongoing systematic revision. Traditional molecular approaches, such as single-locus barcoding, often lack the resolution needed to resolve population structure or to confidently delimit species boundaries in sponges. Target-capture enrichment offers a solution to these challenges. By selectively enriching hundreds to thousands of informative genomic loci, this approach generates high-quality population genomic data without the costs and computational demands of whole-genome sequencing. A custom baitset targeting exonic loci across sponge genomes has recently been developed, providing a foundation for population genomics of our group of interest. In this context, we aim to investigate the population structure and evolutionary history of Petrosia sponges that host the branching worm Ramisyllis and associated barnacles across multiple localities in Japanese waters. Before launching large-scale population genomic surveys, however, a baitset optimised specifically for these Petrosia species is needed. Such a baitset should improve locus capture efficiency and genomic coverage, both of which are crucial for reliable population-level analyses.


Project description

Your task is to optimise an existing Haplosclerida baitset for the genus Petrosia, with special attention to the species involved in the Ramisyllis symbiotic network. Using available genomic, transcriptomic, and target-capture datasets, you will identify the most informative loci for this group and test the performance of the refined baitset in silico. You will evaluate factors such as locus coverage, phylogenomic informativeness, and expected capture efficiency across samples from diverse localities. If the optimised baitset performs well and captures enough informative loci, it will be synthesised and deployed in a large-scale population genomic study. Your work will therefore contribute directly to investigating host population structure, connectivity, and potential co-evolutionary patterns between the sponges and their symbionts.
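
As a rough illustration of what an in silico locus screen of this kind might look like, the sketch below scores each locus by sample occupancy (a simple proxy for locus coverage) and by its fraction of parsimony-informative sites (one common proxy for phylogenomic informativeness). Everything in it is an assumption made for illustration: the function names, the input format (one dictionary of aligned sequences per locus), the 50% called-bases rule for counting a sample as recovered, and the filtering thresholds are hypothetical and are not part of the existing baitset or of any particular pipeline.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def parsimony_informative_sites(alignment):
    """Count columns where at least two bases each occur in at least two sequences."""
    count = 0
    for column in zip(*alignment.values()):
        states = Counter(b.upper() for b in column if b.upper() in VALID_BASES)
        if sum(1 for n in states.values() if n >= 2) >= 2:
            count += 1
    return count


def locus_summary(alignment, n_samples):
    """Return occupancy and informative-site fraction for one aligned locus.

    alignment maps sample name -> aligned sequence (all of equal length).
    n_samples is the total number of samples in the study, including
    samples that are absent from this alignment.
    """
    length = len(next(iter(alignment.values())))
    # Assumption: a sample counts as recovered if >= 50% of its positions are called.
    recovered = sum(
        1
        for seq in alignment.values()
        if sum(b.upper() in VALID_BASES for b in seq) >= 0.5 * length
    )
    return {
        "occupancy": recovered / n_samples,
        "pis_fraction": parsimony_informative_sites(alignment) / length,
    }


def select_loci(loci, n_samples, min_occupancy=0.75, min_pis_fraction=0.01):
    """Return the sorted names of loci that pass both (hypothetical) thresholds."""
    kept = []
    for name, alignment in loci.items():
        stats = locus_summary(alignment, n_samples)
        if (stats["occupancy"] >= min_occupancy
                and stats["pis_fraction"] >= min_pis_fraction):
            kept.append(name)
    return sorted(kept)


# Toy data only: four hypothetical samples, two hypothetical loci.
toy_loci = {
    "locus_a": {
        "s1": "ACGTACGT",
        "s2": "ACGTACGA",
        "s3": "ATGTACGA",
        "s4": "ATGTACGT",
    },
    "locus_b": {
        "s1": "ACGTAC",
        "s2": "ACGTAC",
        "s3": "NNNNNN",  # present in the alignment but entirely uncalled
    },
}

print(select_loci(toy_loci, n_samples=4))
```

In this toy case only locus_a is retained: locus_b is recovered in two of four samples and has no variable sites. In practice the thresholds would need to be tuned to the data, and expected capture efficiency would require additional information (for example bait-to-target sequence divergence) that this sketch does not model.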


Main questions:

(I) Which loci from the existing baitset perform best for the Petrosia species of interest?
(II) Are there genus-specific or clade-specific loci that should be added to maximise phylogenetic resolution within Petrosia?
(III) How does the optimised baitset perform in silico across available genomic and transcriptomic datasets?
(IV) Is the optimised baitset sufficiently robust for population-level analyses, and can it resolve genetic structure among localities?