Genomic organization of the human mucin gene MUC5B. cDNA and genomic sequences upstream of the large central exon.
Abstract
The complete structure of the DNA encoding the polypeptide chain of human mucin MUC5B has been determined. In this paper, we report the full-length cDNA (3886 bp) and genomic (15,143 bp) sequences upstream of the unusually large central exon of the human mucin gene MUC5B. This region, composed of 29 exons, encodes 1283 amino acid residues. Exon sizes range from 44 to 262 bp, and intron sizes from 87 to 1703 bp. We determined the 5'-end of MUC5B by rapid amplification of cDNA ends-polymerase chain reaction experiments, which consistently yielded amplified products of the same length, and by primer extension experiments. A putative translation start site was found at nucleotide +37. We compared the amino-terminal region of MUC5B with those of pro-von Willebrand factor, MUC2, MUC5AC, and the animal mucins RMuc2, PSM, and FIM-B.1. The primary amino acid sequence, which is rich in cysteine residues, shows a high degree of similarity with other members of the 11p15 mucin gene family, particularly MUC5AC. The complete genomic organization and the full-length genomic and cDNA sequences of MUC5B have thus been elucidated. The gene contains 48 exons and encodes 5662 amino acid residues, giving a polypeptide with an Mr of approximately 600,000.
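The figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the reported analysis: it assumes a rule-of-thumb average mass of about 110 Da per amino acid residue (an approximation, not a value taken from this work), and the function names are hypothetical.

```python
# Rough consistency check of the numbers quoted in the abstract.
# Assumption: an average residue mass of ~110 Da, a common rule of thumb;
# the true value depends on the amino acid composition.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_polypeptide_mass(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate mass (Da) of an unglycosylated polypeptide chain."""
    return n_residues * avg_residue_mass


def coding_nucleotides(n_residues):
    """Number of nucleotides needed to encode n_residues (3 per codon)."""
    return 3 * n_residues


# Full-length MUC5B polypeptide: 5662 residues.
full_length_mass = estimate_polypeptide_mass(5662)

# Region upstream of the central exon: 1283 residues, start site at +37,
# so 36 nucleotides precede the putative translation start.
upstream_coding_nt = coding_nucleotides(1283)
leader_nt = 37 - 1

print(f"Estimated polypeptide mass: {full_length_mass:,.0f} Da")
print(f"Coding nucleotides for 1283 residues: {upstream_coding_nt}")
print(f"Leader + coding: {leader_nt + upstream_coding_nt} of 3886 bp")
```

Under this assumption the estimate comes to roughly 620,000 Da, in line with the stated Mr of approximately 600,000, and the 36-nucleotide leader plus 3849 coding nucleotides fit within the 3886 bp cDNA sequence.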