A phased, chromosome-scale genome of ‘Honeycrisp’ apple (Malus domestica)

The apple cultivar ‘Honeycrisp’ has superior fruit quality traits, cold hardiness, and disease resistance, making it a popular breeding parent. However, it suffers from several physiological disorders as well as production and postharvest issues. Despite several available apple genome sequences, understanding of the genetic mechanisms underlying cultivar-specific traits remains lacking. Here, we present a highly contiguous, fully phased, chromosome-level genome of ‘Honeycrisp’ apple, generated using PacBio HiFi, Omni-C, and Illumina sequencing platforms, with two assembled haplomes of 674 Mbp and 660 Mbp and contig N50 values of 32.8 Mbp and 31.6 Mbp, respectively. Overall, 47,563 and 48,655 protein-coding genes were annotated from the two haplomes, capturing 96.8–97.4% complete BUSCOs in the eudicot database. Gene family analysis reveals that most ‘Honeycrisp’ genes are assigned to orthogroups shared with other genomes, with 121 ‘Honeycrisp’-specific orthogroups. This resource is valuable for understanding the genetic basis of important traits in apples and related Rosaceae species to enhance breeding efforts.

In fact, nine new cultivars derived from 'Honeycrisp' are already on the market.
Although critical for sustainable apple production, disease resistance has historically been less important because the market has been dominated by modern cultivars bred primarily for fruit quality and intensive conventional production systems [13]. Most apple cultivars grown commercially in the USA are susceptible to fungal diseases such as apple scab. In temperate and humid regions around the world, frequent applications of fungicides are necessary, contributing significantly to production costs and to negative human health and environmental impacts [14]. 'Honeycrisp' is resistant to apple scab and, importantly, the ability of its fruit to retain crispness and firmness during storage is one of its most outstanding traits [15]. However, other 'Honeycrisp' production issues present challenges for apple growers (Figure 1E-G). 'Honeycrisp' needs a carefully designed nutrient management program during the growing season for optimal production and fruit quality, especially to limit the occurrence of the physiological disorder bitter pit [5]. The total cullage of 'Honeycrisp' fruit is probably among the highest of apple cultivars because of its susceptibility to various postharvest physiological disorders with poorly understood and complex etiologies. These disorders include bitter pit, soft scald, soggy breakdown, and CO2 injury [18][19][20][21]. Postharvest technologies have been developed and deployed to mitigate these disorders [22][23][24]. However, the efficacy of postharvest treatments depends on preharvest orchard management and at-harvest fruit maturity, both of which are key to maintaining postharvest apple fruit quality. Growers must balance the acquisition of certain fruit quality characteristics (e.g., size, color, flesh texture, and sugar content) against the risk of maturity-linked losses in quality that may occur in the supply chain [25].
This balancing act for maximizing at-harvest fruit quality and long-term cold storage potential in controlled atmospheres is especially difficult for 'Honeycrisp'.

CONTEXT
To maximize both our understanding of genetic mechanisms driving important 'Honeycrisp' traits, and to assist tree fruit breeders, high quality genomes are required [26].
Indeed, in the decade since 'Golden Delicious' was sequenced [27], many genes and quantitative trait loci (QTL) linked to disease resistance, fruit quality traits, and abiotic stress tolerance in apples have been identified [7,28,29]. Recent high-quality genomes of 'Gala', the double haploid 'Golden Delicious', and the triploid 'Hanfu' provide genomic resources for apple genetics and breeding [27,30,31]. These studies have identified targeted genomic regions for the development of diagnostic molecular markers to breed disease-resistant apple cultivars with good fruit quality [32]. However, traditional apple breeding is still a resource-intensive and time-consuming process [11,29,32]. Substantial gaps remain in our knowledge of the genetic mechanisms involved in many important apple traits. Here, we report a phased, chromosome-level genome assembly of the 'Honeycrisp' apple cultivar generated from Pacific Biosciences (PacBio) HiFi and Dovetail Omni-C technologies, plus a high-quality annotation, thus providing one of the most contiguous and complete genome resources available for apples to date.

[...] 30-hour movie times. Read length distribution and quality of all HiFi reads were assessed using Pauvre v0.1923 [33].

PacBio HiFi sequencing
To scaffold the genome using chromatin conformation sequencing, 1 g of flash-frozen young leaf material was harvested from 'Honeycrisp' trees at the Washington State University (WSU) Sunrise Research Orchard near Rock Island, WA, USA, and shipped to the HudsonAlpha Institute for Biotechnology in Huntsville, AL, USA. The sequencing library was prepared using the Dovetail Genomics Omni-C kit and was sequenced on an Illumina NovaSeq 6000 with PE150 reads. A subset of 1 million read pairs was used as input for Phase Genomics hic_qc to validate the overall quality of the library [34].
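As a rough illustration of the arithmetic linking read-pair counts to sequencing depth, the sketch below converts paired-end read counts into fold coverage. It uses only figures stated in this article (PE150 reads, a 1 million read-pair QC subset, and the 674 Mbp haplome size); the function names are hypothetical, and the coverage values reported in the tables may have been computed against a different genome size estimate.

```python
def paired_end_depth(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Fold coverage delivered by paired-end reads (two reads per pair)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return 2 * read_pairs * read_length / genome_size


def pairs_for_depth(target_depth: float, read_length: int, genome_size: int) -> float:
    """Number of read pairs needed to reach a target fold coverage."""
    return target_depth * genome_size / (2 * read_length)


HAPLOME_SIZE_BP = 674_000_000  # larger haplome size reported in the abstract
READ_LENGTH_BP = 150           # PE150 reads
QC_SUBSET_PAIRS = 1_000_000    # subset passed to hic_qc

qc_depth = paired_end_depth(QC_SUBSET_PAIRS, READ_LENGTH_BP, HAPLOME_SIZE_BP)
print(f"QC subset depth: {qc_depth:.2f}x")
```

Under these assumptions the QC subset corresponds to well under 1× of a single haplome, which is consistent with its use as a quick library check rather than as scaffolding input.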

Transcriptome sequencing
To facilitate gene annotation, total RNA was isolated using a modified CTAB/chloroform extraction [46] from various tissues harvested from: 'Honeycrisp', 'Red Delicious', and 'Granny Smith' apple trees grown at the WSU Sunrise Research Orchard near Rock Island, WA, USA; 'Gala' and 'WA38' apple trees grown at the WSU and USDA-ARS Columbia View Research Orchard near Orondo, WA, USA; and 'D'Anjou' pear trees grown at the WSU Tree Fruit Research and Extension Center Research Orchard in Wenatchee, WA, USA. Total RNA was assessed for quality (RNA integrity number (RIN) ≥ 8) and purity (A260/280 > 1.8). Sources for all RNA are available in Table 1. Total RNA (2 μg) was used to construct Illumina TruSeq stranded libraries following the manufacturer's instructions. Libraries were sequenced on an Illumina NovaSeq 6000 with PE150 reads at the HudsonAlpha Institute for Biotechnology in Huntsville, AL, USA.
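The two RNA acceptance criteria above can be expressed as a simple filter. The following is a minimal sketch, not the authors' workflow: the thresholds come from the text, while the `RnaSample` class, the `passes_qc` function, and the sample names and values are hypothetical and used only for illustration.

```python
from dataclasses import dataclass

RIN_MIN = 8.0        # quality threshold from the text: RIN >= 8
A260_280_MIN = 1.8   # purity threshold from the text: A260/280 > 1.8


@dataclass
class RnaSample:
    name: str        # hypothetical sample label
    rin: float       # RNA integrity number
    a260_280: float  # absorbance ratio at 260 nm / 280 nm


def passes_qc(sample: RnaSample) -> bool:
    """True if the sample meets both the integrity and purity thresholds."""
    return sample.rin >= RIN_MIN and sample.a260_280 > A260_280_MIN


# Illustrative values only; these are not measurements from the study.
samples = [
    RnaSample("leaf_example", rin=8.4, a260_280=2.05),
    RnaSample("fruit_example", rin=7.6, a260_280=1.95),
    RnaSample("flower_example", rin=8.0, a260_280=1.80),
]
passed = [s.name for s in samples if passes_qc(s)]
print(passed)
```

Note that the RIN criterion is inclusive (≥) while the purity criterion is strict (>), so a sample at exactly A260/280 = 1.8 is rejected in this sketch.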

DATA VALIDATION AND QUALITY CONTROL

A haplotype-phased chromosome-scale assembly
In total, nearly 55× coverage of PacBio HiFi reads and nearly 200× coverage of Dovetail Omni-C reads (Table 3) [...] joins were made to build the final assembly into 17 chromosomes, with 95.4% of the assembled sequence contained in the 17 pseudomolecules representing chromosomes.
Nineteen joins were made for HAP2, with 98.2% of the assembled sequence in the 17 pseudomolecules. Based on the Merqury k-mer analysis (Figure 3), the HAP1 assembly had a k-mer completeness of 82.7% (quality value [QV] 64.5), the HAP2 assembly 83% (QV 66.7), and the combined assembly 98.6% (QV 65.5) (Table 4). BUSCO completeness was 98.6% for HAP1 and 98.7% for HAP2, suggesting high genome completeness for both haplomes, comparable or superior to other high-quality apple genome assemblies (Table 2). The two haplomes are structurally similar to each other (Figure 4). Compared with the assembly statistics of previously published apple genomes, the current 'Honeycrisp' assemblies are the most contiguous to date (Table 2).
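For readers less familiar with the assembly statistics quoted here, the sketch below shows how three of them are conventionally computed: N50, the fraction of sequence anchored in chromosome-scale pseudomolecules, and the per-base error rate implied by a Phred-scaled QV. This is a generic illustration with hypothetical function names and toy lengths; it is not the pipeline used in this study, and Merqury derives its QV from k-mer counts rather than from the simple conversion shown.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


def fraction_in_chromosomes(chromosome_lengths, all_lengths):
    """Share of assembled bases placed in chromosome-scale pseudomolecules."""
    return sum(chromosome_lengths) / sum(all_lengths)


def qv_to_error_rate(qv):
    """Per-base error probability implied by a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


# Toy lengths for illustration only; not taken from the 'Honeycrisp' assembly.
toy_lengths = [40, 30, 20, 10]
print(n50(toy_lengths))                                   # N50 of the toy set
print(fraction_in_chromosomes([40, 30, 20], toy_lengths)) # anchored fraction
print(qv_to_error_rate(64.5))                             # HAP1 QV from the text
```

On the Phred scale, a QV of 60 corresponds to one expected error per million bases, so the QVs above 64 reported here imply fewer than one error per million bases.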

Genome annotation
The yield of Illumina transcriptome sequencing data from fruit, leaf, and flower tissues of apples and pear ranged from approximately 9 to 27 gigabase pairs (Gbp), in flowers and leaf buds, respectively (Table 1). Nearly 62% of both haplomes were annotated as repetitive DNA, mostly composed of long terminal repeat (LTR) retrotransposons ([...])

Figure 3. [...] 'Honeycrisp' genome assemblies. k-mer multiplicity (x-axis) is plotted against k-mer counts (y-axis) to estimate the heterozygosity, copy numbers, sequencing depth, and completeness of a genome using Merqury v1.3 [45]. Colors in the plot represent the number of times each k-mer is found in the genome assembly.

[...] for HAP1 and 97.4% for HAP2, the highest completeness among all publicly available Malus genome annotations ([...])

Gene family analysis
Gene family evaluation was performed using PlantTribes 2 and its 26Gv2-scaffold orthogroup database, which contains representative protein coding sequences from most

RE-USE POTENTIAL
This fully phased, high-quality, chromosome-scale genome of 'Honeycrisp' apple will add to the toolbox for apple genetic research and breeding. It will enable genetic mapping, identification of genes, and development of molecular markers linked to disease, pest resistance, abiotic stress tolerance and adaptation, as well as horticulturally relevant harvest and postharvest fruit quality traits for use in apple breeding programs. Ultimately, the addition of high-quality genomic resources for 'Honeycrisp' can lead to enhanced orchard and supply chain management for many other apple cultivars, promoting future sustainability of the pome fruit industry.

DATA AVAILABILITY
The whole genome sequence data generated in this study have been deposited at the NCBI database under BioProject ID PRJNA791346. PacBio HiFi reads and Hi-C reads are deposited in NCBI with the SRA accession numbers SAMN24287034 and SAMN29611953, respectively. Transcriptomic data generated in this study for genome annotation are deposited in NCBI with SRA accession numbers SAMN29611954 to SAMN29611992. The Maldo.hc.v1a1 'Honeycrisp' genome assembly, gene annotation, and functional annotation for both haplomes can be accessed via the GigaScience GigaDB repository [73], and will also be available in the Genome Database for Rosaceae, where deposition is currently in progress.

CONSENT FOR PUBLICATION
Not applicable.