dc.contributor.author: Coipan, Claudia E
dc.contributor.author: Dallman, Timothy J
dc.contributor.author: Brown, Derek
dc.contributor.author: Hartman, Hassan
dc.contributor.author: van der Voort, Menno
dc.contributor.author: van den Berg, Redmar R
dc.contributor.author: Palm, Daniel
dc.contributor.author: Kotila, Saara
dc.contributor.author: van Wijk, Tom
dc.contributor.author: Franz, Eelco
dc.date.accessioned: 2020-07-11T18:48:37Z
dc.date.available: 2020-07-11T18:48:37Z
dc.date.issued: 2020-01-01
dc.identifier.issn: 2057-5858
dc.identifier.pmid: 32101514
dc.identifier.doi: 10.1099/mgen.0.000318
dc.identifier.uri: http://hdl.handle.net/10029/623924
dc.description.abstract: A large European multi-country Salmonella enterica serovar Enteritidis outbreak associated with Polish eggs was characterized by whole-genome sequencing (WGS)-based analysis, with various European institutes using different analysis workflows to identify isolates potentially related to the outbreak. The objective of our study was to compare the output of six of these different typing workflows (distance matrices of either SNP-based or allele-based workflows) in terms of cluster detection and concordance. To this end, we analysed a set of 180 isolates coming from confirmed and probable outbreak cases, which were representative of the genetic variation within the outbreak, supplemented with 22 unrelated contemporaneous S. enterica serovar Enteritidis isolates. Since the definition of a cluster cut-off based on genetic distance requires prior knowledge on the evolutionary processes that govern the bacterial populations in question, we used a variety of hierarchical clustering methods (single, average and complete) and selected the optimal number of clusters based on the consensus of the silhouette, Dunn2, and McClain-Rao internal validation indices. External validation was done by calculating the concordance with the WGS-based case definition (SNP-address) for this outbreak using the Fowlkes-Mallows index. Our analysis indicates that with complete-linkage hierarchical clustering combined with the optimal number of clusters, as defined by three internal validity indices, the six different allele- and SNP-based typing workflows generate clusters with similar compositions. Furthermore, we show that even in the absence of coordinated typing procedures, but by using an unsupervised machine learning methodology for cluster delineation, the various workflows that are currently in use by six European public-health authorities can identify concordant clusters of genetically related S. enterica serovar Enteritidis isolates; thus, providing public-health researchers with comparable tools for detection of infectious-disease outbreaks. (en_US)
dc.language.iso: en (en_US)
dc.subject: epidemiology (en_US)
dc.subject: hierarchical clustering (en_US)
dc.subject: infectious disease (en_US)
dc.subject: surveillance (en_US)
dc.subject: unsupervised machine learning (en_US)
dc.subject: whole-genome sequencing (en_US)
dc.title: Concordance of SNP- and allele-based typing workflows in the context of a large-scale international Enteritidis outbreak investigation. (en_US)
dc.type: Article (en_US)
dc.identifier.journal: Microb Genom 2020; 6(3):e000318 (en_US)
dc.source.journaltitle: Microbial genomics
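
The abstract reports external validation with the Fowlkes-Mallows index, which scores how well two partitions of the same isolates agree. The sketch below illustrates the standard pair-counting form of that index, FM = TP / sqrt((TP + FP)(TP + FN)), and is not the authors' code. The function name and the toy cluster labels are hypothetical and not taken from the study.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    # Pairs of items grouped together in BOTH partitions (TP).
    joint = Counter(zip(labels_a, labels_b))
    tp = sum(comb(n, 2) for n in joint.values())
    # Pairs grouped together in each partition on its own
    # (TP + FP for the first, TP + FN for the second).
    pairs_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    pairs_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    if pairs_a == 0 or pairs_b == 0:
        return 0.0  # no co-clustered pairs to compare
    return tp / sqrt(pairs_a * pairs_b)


# Hypothetical toy data: cluster labels from one typing workflow and
# the matching reference labels for the same six isolates.
workflow_clusters = [1, 1, 1, 2, 2, 3]
reference_labels = ["a", "a", "b", "b", "b", "c"]
score = fowlkes_mallows(workflow_clusters, reference_labels)
```

A score of 1 means both partitions group exactly the same pairs of isolates; values near 0 mean they share almost no co-clustered pairs.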

