In silico characterization of chitin deacetylase genes in the Diaphorina citri genome

Chitin deacetylases (CDAs) are one of the least understood components of insect chitin metabolism. The partial deacetylation of chitin polymers appears to be important for the proper formation of higher order chitin structures, such as long fibers and bundles, which contribute to the integrity of the insect exoskeleton and other structures. Some CDAs may also be involved in bacterial defense. Here, we report the manual annotation of four CDA genes from the Asian citrus psyllid, Diaphorina citri, laying the groundwork for future study of these genes.

More recently, genomic and phylogenetic studies have shown that CDAs are present widely in insects and can be classified into five different groups [5,6]. Most holometabolous insects have at least one representative of each of the five CDA groups, while the hemimetabolous insects that have been examined lack group II and group V genes [6]. The exact role of insect CDAs is not well understood, but they may play a role in organization of chitin molecules into higher order structures [7].

Context
Loss of function experiments indicate that some CDAs are essential for growth and development, making them a potential target for insect pest control [6,8-12]. Here we describe the chitin deacetylase gene family in the Asian citrus psyllid, Diaphorina citri (Hemiptera: Liviidae; NCBI:txid121845). D. citri is the vector for Candidatus Liberibacter asiaticus (CLas), which is responsible for the global outbreak of Huanglongbing (citrus greening) disease. We identified four chitin deacetylase genes in the D. citri v3 genome, three of which have multiple isoforms. As in other hemipterans, only groups I, III and IV are represented [6].

Figure 1. Protocols.io protocol outlining the annotation process of the D. citri genome [16]. https://www.protocols.io/widgets/doi?uri=dx.doi.org/10.17504/protocols.io.bniimcce

METHODS
Chitin deacetylase genes in D. citri genome v3 [13] were identified by BLAST (NCBI BLAST, RRID:SCR_004870) search of D. citri sequences with chitin deacetylase orthologs from other insects. Orthology was confirmed by reciprocal BLAST of the National Center for Biotechnology Information (NCBI) non-redundant protein database [14]. Genes were manually annotated in Apollo 2.1.0 (RRID:SCR_001936) [15] using available evidence, including RNA-seq reads, Iso-seq transcripts and de novo-assembled transcripts. A more detailed annotation protocol is available at protocols.io (Figure 1) [16].
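The reciprocal BLAST step can be illustrated with a simplified pairwise reciprocal-best-hit check. This is only a sketch and not the authors' procedure, which searched the NCBI non-redundant database and relied on manual review. It assumes BLAST results in the default 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12). All sequence identifiers and scores below are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def best_hits(rows: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Return the highest-bit-score subject for each query in tabular BLAST rows."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in rows:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(forward: Iterable[str],
                         reverse: Iterable[str]) -> List[Tuple[str, str]]:
    """Pairs (query, subject) that are each other's best hit in both searches."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    pairs = []
    for query, (subject, _) in fwd.items():
        back = rev.get(subject)
        if back is not None and back[0] == query:
            pairs.append((query, subject))
    return sorted(pairs)


def row(query: str, subject: str, bitscore: float) -> str:
    """Build a 12-column tabular row; the middle columns are placeholder values."""
    middle = ["50.0", "400", "200", "0", "1", "400", "1", "400", "1e-50"]
    return "\t".join([query, subject] + middle + [str(bitscore)])


# Hypothetical searches: known insect CDAs against D. citri, and the reverse.
forward = [row("TcCDA1", "Dcit_g1", 600), row("TcCDA1", "Dcit_g2", 200),
           row("TcCDA2", "Dcit_g2", 550)]
reverse = [row("Dcit_g1", "TcCDA1", 600), row("Dcit_g2", "TcCDA2", 550),
           row("Dcit_g2", "TcCDA1", 200)]

rbh = reciprocal_best_hits(forward, reverse)
```

Here `rbh` contains the two pairs whose best hits agree in both directions. A best-hit criterion like this is a convenient first filter, but closely related paralogs such as CDA1 and CDA2 would still benefit from the phylogenetic and manual checks described in the text.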

DATA VALIDATION AND QUALITY CONTROL
Chitin deacetylase genes in the D. citri v3 genome [13] were identified and manually annotated as described below. These genes were classified following the precedents established in other insects [5,6].

Group I chitin deacetylases
Most insects have two group I genes named CDA1 and CDA2 (Table 3). The proteins encoded by these genes have an N-terminal chitin-binding domain (ChBD), a low-density lipoprotein receptor class A domain (LDLa), and a deacetylase catalytic domain [5]. RNA interference (RNAi) of group I CDAs in a variety of insects suggests that loss of function of CDA1 or CDA2 can result in lethality and therefore these genes could be potential targets for pest control.

Table caption: Species, accession number, full name and abbreviated name are provided for all orthologs used in phylogenetic analysis.

As expected, we identified two group I genes in D. citri, which we named CDA1 and CDA2.
Both genes encode proteins with the typical group I domain structure (Figure 3). We identified two isoforms each for D. citri CDA1 and CDA2 (Table 4). CDA2 has previously been shown to have multiple isoforms in several holometabolous insect species, with the transcripts differing only in the use of one alternative exon [5,10,12]. This gene structure is conserved in D. citri CDA2 with alternate exons 3a and 3b. The two D. citri CDA1 isoforms differ in the presence or absence of a 24-bp exon upstream of the last exon. Expression data from RNA-seq datasets available through CGEN [20] suggest that, in general, expression of CDA1 and CDA2 is higher in nymphs and eggs than in adults (Figure 2A).
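
The isoform differences described above (a pair of alternative exons in CDA2, and an optional 24-bp exon in CDA1) amount to differences between sets of exon coordinates. The following minimal sketch shows one way such a comparison could be made. The coordinates are invented for illustration and do not correspond to the actual D. citri gene models, and the function names are hypothetical.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive coordinates


def exon_length(exon: Exon) -> int:
    """Length in base pairs of an exon with inclusive coordinates."""
    return exon[1] - exon[0] + 1


def exon_differences(iso_a: List[Exon],
                     iso_b: List[Exon]) -> Tuple[List[Exon], List[Exon]]:
    """Return (exons only in iso_a, exons only in iso_b), each sorted by position."""
    a, b = set(iso_a), set(iso_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical CDA1-like isoforms: one includes a short exon before the last exon.
cda1_long = [(100, 250), (400, 520), (700, 723), (900, 1100)]
cda1_short = [(100, 250), (400, 520), (900, 1100)]
only_long, only_short = exon_differences(cda1_long, cda1_short)
optional_exon_bp = exon_length(only_long[0])  # 723 - 700 + 1 = 24

# Hypothetical CDA2-like isoforms: mutually exclusive exons "3a" and "3b".
cda2_a = [(100, 250), (400, 520), (1000, 1150), (2000, 2300)]
cda2_b = [(100, 250), (400, 520), (1300, 1450), (2000, 2300)]
only_a, only_b = exon_differences(cda2_a, cda2_b)
```

In the first comparison only one isoform has a unique exon (an optional exon), whereas in the second each isoform has exactly one unique exon (mutually exclusive alternative exons).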
In Drosophila and Tribolium, the CDA1 and CDA2 orthologs are adjacent to one another in the genome [3,5] on chromosomes 3 and 5, respectively. The conserved clustering of these genes suggests there may be evolutionary constraint on their physical location. We found that the D. citri CDA1 and CDA2 orthologs are also adjacent to one another on chromosome 4. In the D. citri v3 genome, these genes are separated by approximately 50 kilobase pairs (kb), although this distance appears to be inflated by falsely duplicated fragments of both genes in this assembly.

Group III chitin deacetylases
We identified one group III CDA in the D. citri v3 genome (Tables 3 and 4). This gene has been previously described and was named CDA3 because of its orthology to

Table caption: Expression levels (transcripts per million, TPM) for annotated chitin deacetylase transcripts from available RNA-seq experiments used for Figure 2. All data are publicly available and were obtained from CGEN [20]. Developmental stage, citrus host, CLas (Candidatus Liberibacter asiaticus) infection status and tissue of each sample are provided in the first column.

Table caption: D. citri gene numbers were determined based on annotation of the D. citri genome v3. All other ortholog numbers were obtained from published sources [5,6,24-32]. An asterisk (*) indicates that isoforms have been found for at least one member of the group in that organism.
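
Expression levels in this report are given in transcripts per million (TPM). For readers unfamiliar with the unit, the sketch below applies the standard TPM definition: read counts are divided by transcript length, then scaled so that the values in a sample sum to one million. It is not a description of how the CGEN values were computed, which the text does not specify. Transcript names, counts and lengths are hypothetical.

```python
from typing import Dict


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Convert read counts to TPM using transcript lengths in base pairs."""
    # Reads per kilobase of transcript (length normalization).
    rates = {t: counts[t] / (lengths_bp[t] / 1000.0) for t in counts}
    total = sum(rates.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    # Scale so that all transcripts in the sample sum to one million.
    return {t: rate / total * 1_000_000 for t, rate in rates.items()}


# Hypothetical sample with two transcripts of different lengths.
counts = {"tx_short": 100, "tx_long": 300}
lengths = {"tx_short": 1000, "tx_long": 3000}
values = tpm(counts, lengths)
```

In this toy sample the longer transcript has three times as many reads but also three times the length, so both transcripts receive the same TPM value. Because TPM values sum to the same total in every sample, the proportion of a transcript can be compared across samples such as nymph and adult libraries.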

Group IV chitin deacetylases
Most insects examined to date have one group IV CDA, typically called CDA5 (CDA4 in N. lugens) (Table 3). CDA5 has been shown to have multiple isoforms in Tribolium and Drosophila [5]. Consistent with these observations, we identified and annotated five different isoforms of CDA5 in D. citri (Table 4). Unfortunately, the annotated models are missing a small amount of 3′ sequence due to genome assembly issues. However, we

Other chitin deacetylase groups
We did not find any group II or group V CDAs in the D. citri v3 genome (Figure 4). To our knowledge CDAs from these groups have not been found in any hemipteran insects examined to date [6], so their absence in D. citri was expected.

determine the expression pattern and function of specific isoforms. Our annotations will be incorporated into an updated official gene set and will be publicly available for comparative expression profiling on the CGEN [20].

DATA AVAILABILITY
The Diaphorina citri genome assembly, gene sets, and transcriptome data are accessible via the Citrus Greening website [20]. All accessions for genes used for phylogenetic analysis are provided within this report, and all additional data is available via the GigaScience GigaDB

EDITOR'S NOTE
This article is one of a series of Data Releases crediting the outputs of a student-focused and community-driven manual annotation project curating gene models and, if required, correcting assembly anomalies for the Diaphorina citri genome project [13].

ETHICAL APPROVAL
Not applicable.

CONSENT FOR PUBLICATION
Not applicable.

COMPETING INTERESTS
The authors declare that they have no competing interests.

AUTHORS' CONTRIBUTIONS
WBH, SJB, TD and LAM conceptualized the study; TD, SS, TDS and SJB supervised the study; SJB, TD, SS, and LAM contributed to project administration; SM, TDS, and BT conducted investigation; PH, MF-G, and SS contributed to software development; SS, TDS, PH, and MF-G developed methodology; SJB, TD, WBH, and LAM acquired funding; SM and TDS prepared and wrote the original draft; SS, WBH and SJB reviewed and edited the draft.