Genomic features of Mycobacterium avium subsp. hominissuis isolated from pigs in Japan

Mycobacterium avium subsp. hominissuis (MAH) is one of the most important agents of nontuberculous mycobacterial infection in humans and pigs. Genome analysis of MAH isolates from humans has advanced, but studies of isolates from pigs remain limited even though pigs are a potential source of infection for humans. Here, we obtained 30 draft genome sequences of MAH from pigs reared in Japan. The 30 draft genomes were 4,848,678–5,620,788 bp in length and comprised 4652–5388 coding genes and 46–75 (median: 47) tRNAs. All isolates had restriction modification-associated genes and 185–222 predicted virulence genes. Two isolates had tRNA arrays and one isolate had a clustered regularly interspaced short palindromic repeat (CRISPR) region. Our results provide a foundation for genome-based epidemiological studies and will be useful for evaluating the ecology of MAH.

Recently, there has been extensive progress in the genomic epidemiology of MAH. Based on findings from our recent studies, MAH is divided into six major lineages: MahEastAsia1, MahEastAsia2, and SC1–4. Each lineage is predominant in specific regions on a global scale [11,12]. For example, the MahEastAsia1 and MahEastAsia2 lineages are frequently isolated from human lung disease in Japan and Korea, whereas the SC1–4 lineages are isolated in America and Europe [11,12]. Japanese pig isolates are mainly classified into two lineages, SC2 and SC4 [11,12]. However, the number of pig isolates used in these studies was insufficient to clarify the ecology of MAH precisely.
Most of the essential genes of MAH are thought to be mutual orthologs of genes in Mycobacterium tuberculosis (MTB) [13]. Although components of virulence systems have been investigated [14], reports on genome content, including drug resistance genes, are not available despite the increasing incidence of MAH disease [1]. Characterizing the defining genomic features of MAH is essential for understanding its evolution and distribution and for identifying targets for antimicrobial drug discovery.
Here, we obtained draft genome sequences of 30 MAH (NCBI:txid439334) isolates derived from pigs reared in Japan, and identified genomic features including bacterial defense systems, such as the restriction-modification (RM) system and clustered regularly interspaced short palindromic repeats (CRISPR), as well as tRNA arrays, virulence factors and drug resistance genes. The results of this study may enable a greater understanding of the epidemiological relationship between MAH in humans and pigs.

Methods
Protocols for bacterial isolation and DNA extraction are available in a protocols.io collection (Figure 1 [15]).

Sampling
MAH isolates were collected from pigs reared in two areas of Japan, Tokai and Hokuriku, where about 10% of the country's pigs are reared. Forty-eight mesenteric or mandibular lymph

Bacterial isolation and DNA extraction
The method used for bacterial isolation is available in protocols.io [16]. Mesenteric or mandibular lymph nodes with mycobacterial granulomatous lesions were mixed with 400 μl of 2% NaOH and incubated at room temperature overnight. The samples were spread onto 2% Ogawa medium (Kyokuto Pharmaceutical, Tokyo, Japan) and incubated at 37°C for 3–4 weeks. A single colony was inoculated into 7H11 broth supplemented with 10% oleic acid-albumin-dextrose-catalase. The isolates were stored in Microbank (Pro Lab Diagnostics Inc., Richmond Hill, ON, Canada) at −80°C. The method for extracting genomic DNA is also available in protocols.io [17]. In brief, cells were delipidated by treatment with acetone, then lysed with lysozyme and Proteinase K. Genomic DNA was extracted from the lysates by phenol/chloroform treatment.

Detection of bacterial defense systems (RM system and CRISPR-Cas system) in the MAH genome
RM systems were identified using the online tool Restriction-ModificationFinder v.1.1 [34], which was run twice [35]. The first run searched for the RM systems of MAH (database: All incl. putative genes; threshold for %ID: 90%; minimum length: 80%). The second run used relaxed settings to confirm orthologues in MTB or other mycobacteria (database: All; threshold for %ID: 10%; minimum length: 20%). CRISPR-Cas systems were identified using the online tool CRISPRCasFinder v.4.2.2 [36] with default settings [37,38].
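The two-pass search described above amounts to applying a strict and then a relaxed pair of thresholds to the same set of candidate hits. The following is a minimal sketch of that idea only, not the tool's implementation: the `Hit` record, the gene names and the inclusive comparison are assumptions made for illustration, and "minimum length" is interpreted here as percent coverage of the database entry.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str            # hypothetical gene label, for illustration only
    pct_identity: float  # percent identity to the database entry
    pct_coverage: float  # aligned length as a percentage of the entry length


def filter_hits(hits, min_identity, min_coverage):
    """Keep hits that meet both thresholds (inclusive, by assumption)."""
    return [
        h for h in hits
        if h.pct_identity >= min_identity and h.pct_coverage >= min_coverage
    ]


# Hypothetical hits; these are not results from the study.
hits = [
    Hit("hypothetical_RM_gene_A", 98.5, 100.0),
    Hit("hypothetical_RM_gene_B", 45.0, 60.0),
    Hit("hypothetical_RM_gene_C", 8.0, 15.0),
]

# Pass 1: strict settings (%ID 90, minimum length 80).
strict_names = [h.gene for h in filter_hits(hits, 90.0, 80.0)]
# Pass 2: relaxed settings (%ID 10, minimum length 20).
relaxed_names = [h.gene for h in filter_hits(hits, 10.0, 20.0)]

print(strict_names)
print(relaxed_names)
```

Hits that appear only in the relaxed pass are the ones that would need further checking as possible distant orthologues.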

Detection of tRNA arrays in the MAH genome
The total numbers of tRNAs were retrieved from the GenBank (gb) files annotated by PGAP.
Draft genomes of isolates GM17 and OCU479, which had more tRNAs than the others (Table 1), were inspected with tRNAscan-SE v.2.0 (RRID:SCR_010835) to search for tRNA arrays [39]. The tRNA gene isotype synteny (expressed in the single-letter amino acid code) of both isolates and of the reference strains was aligned and used for maximum likelihood phylogenetic analysis in MEGA 7.0. Classification of both isolates was conducted as previously described [40].
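Expressing tRNA isotype synteny in the single-letter amino acid code can be illustrated with a short sketch. It assumes each predicted tRNA is available as a hypothetical `(start, isotype)` pair with a three-letter isotype name; the coordinates, the `Undet` label and the `X` placeholder for unrecognized isotypes are illustrative assumptions, not part of the published method.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def synteny_string(trnas, unknown="X"):
    """Return tRNA isotypes in genomic order as a one-letter string.

    trnas: iterable of (start, isotype) tuples.
    unknown: placeholder for isotypes outside the 20 standard amino acids
             (an assumption of this sketch).
    """
    ordered = sorted(trnas, key=lambda t: t[0])
    return "".join(THREE_TO_ONE.get(isotype, unknown) for _, isotype in ordered)


# Hypothetical tRNA predictions, deliberately out of genomic order.
example = [(5200, "Gly"), (5000, "Ala"), (5450, "Leu"), (5320, "Undet")]
result = synteny_string(example)
print(result)  # AGXL
```

Strings built this way for each isolate and reference strain can then be aligned like any other character sequence.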

Detection of virulence factors and drug resistance genes
Virulence genes were identified using VFanalyzer (release 5) [41,42]. We selected the  [44]. To confirm the mutations detected by RGI, we retrieved the respective drug resistance-associated genes from the draft genome sequences, aligned them in MEGA 7.0, and then manually checked the nucleotide sequences for mutations.
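The manual check of a reported mutation against aligned sequences can be sketched as follows. This is an illustrative helper rather than the procedure used in the study: it assumes the alignment is available as equal-length gapped strings, and the sequences, names and position below are toy values.

```python
def alignment_column(aligned_ref, ref_pos, gap="-"):
    """Map a 1-based position in the ungapped reference to a 0-based column."""
    seen = 0
    for col, base in enumerate(aligned_ref):
        if base != gap:
            seen += 1
            if seen == ref_pos:
                return col
    raise ValueError("reference position lies beyond the aligned reference")


def bases_at(aligned_ref, aligned_isolates, ref_pos):
    """Return the base each isolate carries at a given reference position."""
    col = alignment_column(aligned_ref, ref_pos)
    return {name: seq[col] for name, seq in aligned_isolates.items()}


def isolates_differing(aligned_ref, aligned_isolates, ref_pos):
    """List isolates whose base differs from the reference at ref_pos."""
    col = alignment_column(aligned_ref, ref_pos)
    return [
        name for name, seq in aligned_isolates.items()
        if seq[col] != aligned_ref[col]
    ]


# Toy alignment with one gap column in the reference (illustration only).
ref = "AC-GTAC"
isolates = {"isolate_1": "ACTGTAC", "isolate_2": "AC-GGAC"}

calls = bases_at(ref, isolates, ref_pos=4)
changed = isolates_differing(ref, isolates, ref_pos=4)
print(calls, changed)
```

Mapping the reference coordinate to an alignment column first matters because gaps shift column indices relative to gene coordinates.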

Identification of MAH
The experimental workflow from sampling to identification is shown in Figure 2. We successfully obtained 13 MAH isolates derived from the Tokai area. Of these, 8 isolates (GM5-GM44), together with 22 isolates from the Tokai and Hokuriku areas (OCU467-OCU486, Toy194 and Toy195), were used for draft genome sequence analysis. We conducted multiple examinations to identify the isolates as MAH, including insertion sequence (IS) possession patterns and sequence analysis of hsp65 [45]. The patterns of IS possession differ among M. avium subspecies and are used for subspecies identification [46]. IS900 and IS901 are known to be indicators of M. avium subsp. paratuberculosis (MAP) and M. avium subsp. avium (MAA), respectively [22,23]. MAH is usually positive for IS1245 [47] and negative for IS900, IS901 and IS902 [21]; however, MAH strains without IS1245 are frequently found in Japan [46,48]. In our study, 10/30 isolates (33.3%) were negative for IS1245 and none had IS900, IS901 or IS902 [45]. Subspecies of M. avium are also usually identified by hsp65 gene analysis, as the gene has 17 single nucleotide polymorphism (SNP) variations among subspecies [20]. MAH usually has hsp code 1, 2, 3, 7, 8 or 9 [20]; however, five isolates in this study had unclassified hsp codes (indicated by N) [45]. Therefore, we also conducted partial sequence analysis of the rpoB gene, and the isolates were identified as MAH by BLAST analysis. In addition, we conducted phylogenetic analysis based on the hsp65 and rpoB genes retrieved from the draft genomes, and all isolates in this study were again classified as MAH (Figure 3). Together, these examinations confirmed that our isolates were MAH.
[45]. All isolates shared the same two drug resistance genes: mtrA, which is associated with cell division and cell wall integrity [52] and with resistance to macrolide antibiotics, and RbpA, which regulates bacterial transcription and is associated with rifampicin resistance [45,53]. In addition, SNPs associated with drug resistance were found.
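Returning to the IS possession patterns used for identification in this section, the decision logic can be written out as a small sketch. The function name and return labels are hypothetical, and an IS pattern on its own is not conclusive: in this study identification also relied on hsp65 and rpoB analysis.

```python
def interpret_is_pattern(present):
    """Interpret a set of detected IS elements for one M. avium isolate.

    present: set of IS element names detected, e.g. {"IS1245"}.
    Returns a hedged label; IS patterns alone do not confirm a subspecies.
    """
    if "IS900" in present:
        return "MAP indicator"
    if "IS901" in present:
        return "MAA indicator"
    if "IS902" in present:
        return "not consistent with MAH"
    if "IS1245" in present:
        return "consistent with MAH"
    # MAH strains lacking IS1245 are frequently found in Japan.
    return "consistent with MAH (IS1245-negative)"


# Hypothetical detection results, for illustration only.
labels = {
    "isolate_a": interpret_is_pattern({"IS1245"}),
    "isolate_b": interpret_is_pattern(set()),
    "isolate_c": interpret_is_pattern({"IS900", "IS1245"}),
}
print(labels)
```

The IS1245-negative branch matters for Japanese isolates, where treating IS1245 as mandatory would misclassify a substantial share of MAH.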

Draft genome data
All isolates had a C117D change in the murA gene, which confers resistance to fosfomycin. An A2274G mutation in the M. avium 23S rRNA, which contributes to macrolide resistance, was also reported by RGI, but when we examined the aligned nucleotide sequences, no point mutation was found in any isolate [45]. CRISPR regions, virulence factors and drug resistance genes were identified using online tools. The underlying databases of each tool used in this study were updated in 2020, so our data reflect the knowledge available at that time.

tRNA arrays
tRNA arrays were detected in isolates GM17 and OCU479 (Table 3). In a previous study, a tRNA array was discovered in some MAH isolates, and phylogenetic analysis based on its nucleotide sequences placed the MAH tRNA array in a specific group [40]. We performed phylogenetic analysis to confirm that the tRNA arrays detected in this study were authentic. Our tRNA arrays were classified into group 3, as defined in the previous study (Figure 4) [40].

RE-USE POTENTIAL
MAH is one of the most important M. avium subspecies causing nontuberculous mycobacterial infection in humans and pigs. Pigs are suspected to be the predominant animal host of MAH and a potential source of infection for humans [7–10]. However, genomic studies on the relationship between human and pig MAH isolates are limited [11,12]. Our study provides 30 draft genome sequences of MAH isolated from pigs. These data will be useful for genome-based epidemiological studies to evaluate the importance of pigs as a source of infection. In addition, we provide molecular identification of defense