WO1999024065A1 - COMPOUNDS INHIBITING CD4-gp120 INTERACTION AND USES THEREOF - Google Patents

Info

Publication number
WO1999024065A1
Authority
WO
WIPO (PCT)
Prior art keywords
group
gp120
side chain
binds
disrupts
Prior art date
Application number
PCT/US1998/023906
Other languages
French (fr)
Inventor
Peter D. Kwong
Wayne A. Hendrickson
Joseph G. Sodroski
Richard T. Wyatt
James M. Samanen
Original Assignee
The Trustees Of Columbia University In The City Of New York
Dana-Farber Cancer Institute, Inc.
Smithkline Beecham Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Trustees Of Columbia University In The City Of New York, Dana-Farber Cancer Institute, Inc., Smithkline Beecham Corporation
Priority to AU14545/99A
Publication of WO1999024065A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A61K39/21 Retroviridae, e.g. equine infectious anemia virus
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • A61P31/18 Antivirals for RNA viruses for HIV
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16111 Human Immunodeficiency Virus, HIV concerning HIV env
    • C12N2740/16134 Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein

Definitions

  • HIV Human Immunodeficiency Virus
  • gp120 Human Immunodeficiency Virus-1 envelope glycoprotein
  • AIDS acquired immunodeficiency syndrome
  • CD4 and a chemokine receptor: the two cellular receptors of the human host; the chemokine receptor is primarily CXCR-4 or CCR-5, depending on viral strain
  • The gp120 protein has been an obvious target for structural investigation, and quantities of pure soluble protein have been available for several years, a byproduct in part from vaccine trials. Nevertheless, despite considerable effort, it has resisted crystallographic analysis for more than a decade.
  • The mature gp120 glycoproteins of different HIV-1 strains have approximately 470-490 amino acids (18).
  • Extensive N-linked glycosylation at approximately 20-25 sites accounts for roughly half its mass (18,19).
  • Sequences from many different viral isolates show that it contains five conserved regions (C1-C5) and five variable regions (V1-V5) (18,20) and nine conserved disulfide bridges (19).
  • Proteolytic digestion does not reveal a sub-domain structure. Indeed, even after extensive proteolytic cleavage, the unreduced protein runs near its native molecular weight on SDS-PAGE (Peter D. Kwong: unpublished data).
  • The variable regions, the V3 loop in particular, appear to be conformationally variable. Conformational change is also evidenced by shedding, the CD4-induced dissociation of gp120 from the surface of the virus, and by ligand-induced variations in monoclonal antibody binding (21,22). These changes may be related to the functional role of gp120 in virus entry.
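As a rough arithmetic check on the figures quoted in this background (470-490 residues, with glycosylation contributing roughly half the mass), the sketch below estimates the total glycoprotein mass. The average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a value from the source; the function names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# ASSUMPTION: this value is not taken from the source text.
AVG_RESIDUE_MASS_DA = 110.0


def polypeptide_mass_da(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate mass of the unglycosylated polypeptide chain."""
    return n_residues * avg_residue_mass


def glycoprotein_mass_da(n_residues, glycan_mass_fraction=0.5):
    """Approximate total mass when glycans make up the given mass fraction.

    The default of 0.5 reflects the 'roughly half its mass' statement.
    """
    if not 0.0 <= glycan_mass_fraction < 1.0:
        raise ValueError("glycan_mass_fraction must be in [0, 1)")
    return polypeptide_mass_da(n_residues) / (1.0 - glycan_mass_fraction)


if __name__ == "__main__":
    for n in (470, 490):
        print(n, "residues:", round(glycoprotein_mass_da(n) / 1000.0, 1), "kDa")
```

Under these assumptions the estimate lands a little above 100 kDa, which is the right order of magnitude for a glycoprotein named for an apparent mass near 120 kDa.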
  • This invention provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isobutyl (isopropyl or methyl) group of gp120 isoleucine (valine or alanine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (isopropyl or methyl) group of gp120 isoleucine (valine or alanine) 371 and CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein Het is phenyl, Bn,
  • a group, X, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein X is hydroxyalkyl, hydroxyaryl, alkylamide, or arylamide;
  • an aromatic group or heteroaromatic group, Het, that binds to the side chain indole group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • an alkyl group, R, that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43, wherein R is alkyl, cycloalkyl, or haloalkyl;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • k. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
  • a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59, wherein Z is alkoxyalkyl, aryloxyalkyl, alkoxyaryl, haloalkyl, haloaryl, alkylamide, arylamide, alkylcarboxylate, arylcarboxylate, arylalkyl ester, dialkyl ester, or alkylaryl ester.
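The "two or more of the following interactions" criterion can be sketched as a simple lookup and count. This is a minimal illustration, not a method from the source: the table below encodes only the residue pairs and interaction types named in this first list (residue variants such as valine or alanine at position 371 are omitted), and the key strings, class name, and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    gp120_group: str   # gp120 residue and group, as named in the text
    cd4_residue: str   # CD4 partner residue
    kind: str          # interaction type, as named in the text


# Hypothetical keys; the entries paraphrase the list in the text.
INTERACTIONS = {
    "Ile371-Phe43": Interaction("Ile 371 side chain isobutyl", "Phe 43", "hydrophobic"),
    "Asp368-Phe43": Interaction("Asp 368 side chain carboxylate", "Phe 43", "dipolar"),
    "Trp427-Phe43": Interaction("Trp 427 side chain indole", "Phe 43", "hydrophobic"),
    "Gly473CH2-Phe43": Interaction("Gly 473 alpha methylene", "Phe 43", "hydrophobic"),
    "Gly473CO-Phe43": Interaction("Gly 473 alpha carbonyl", "Phe 43", "dipolar"),
    "Glu370-Phe43": Interaction("Glu 370 side chain propionate", "Phe 43", "hydrophobic"),
    "Asn425-Phe43": Interaction("Asn 425 alpha carbonyl", "Phe 43", "CHO hydrogen bond"),
    "Val430-Arg59": Interaction("Val 430 side chain isopropyl", "Arg 59", "hydrophobic"),
    "Thr123-Arg59": Interaction("Thr 123 side chain propylalcohol", "Arg 59", "hydrogen bond"),
}


def disrupts_two_or_more(disrupted_keys):
    """Return True if at least two distinct listed interactions are disrupted.

    Keys that are not in INTERACTIONS are ignored rather than counted.
    """
    return len(set(disrupted_keys) & set(INTERACTIONS)) >= 2


if __name__ == "__main__":
    print(disrupts_two_or_more(["Asp368-Phe43", "Trp427-Phe43"]))
    print(disrupts_two_or_more(["Asp368-Phe43"]))
```

Counting distinct keys means a compound that hits the same interaction through two functional groups is still counted once, which is one possible reading of the criterion.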
  • This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
  • a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
  • m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
  • a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
  • a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
  • a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
  • an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
  • a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
  • a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
  • d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
  • a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
  • g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
  • a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
  • This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
  • a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
  • m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
  • n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
  • a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
  • a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
  • a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
  • a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
  • a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
  • a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
  • c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
  • g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
  • a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
  • a group, X, that binds to the side chain isopropyl alcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 with the side chain ethyl alcohol group of CD4 serine 50.
  • k1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain phenyl group of gp120 phenylalanine 382 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine
  • an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 384 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
  • m1. an alkyl group, R, that binds to the side chain alkyl group of gp120 valine (isoleucine, or glutamine) 255 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
  • a group, Y, that binds to the side chain carboxyl group of gp120 glutamic acid 370 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine
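Several items in this list attach an explicit binding distance to a gp120 group. A minimal sketch of how those distance-annotated contacts might be tabulated and filtered is given below. The distances and group names are taken from the list; the tuple layout and function names are hypothetical, and the CD4 partner recorded for each entry is the one named in the same item (for the glycine 472 item the text gives residue 473 in its second clause, so only the binding-site residue is recorded here).

```python
from typing import NamedTuple


class Contact(NamedTuple):
    gp120_group: str
    cd4_group: str
    distance_angstrom: float


# Distances as stated in the list; names paraphrase the text.
CONTACTS = [
    Contact("Gly 472 alpha amino", "Gln 40 side chain propionamide", 3.6),
    Contact("Asp 474 alpha amino", "Gln 40 side chain propionamide", 3.6),
    Contact("Asp 474 alpha amino", "Gln 40 alpha carbonyl", 3.4),
    Contact("Asn 280 side chain acetamide", "Lys 29 side chain butylammonium", 3.3),
    Contact("Gly 459 alpha methylene", "Asn 32 alpha carbonyl", 3.1),
    Contact("Gly 459 alpha amino", "Asn 32 alpha carbonyl", 3.4),
    Contact("Thr 123 side chain isopropyl alcohol", "Ser 50 side chain ethyl alcohol", 3.8),
]


def contacts_within(cutoff_angstrom, contacts=CONTACTS):
    """Return contacts at or below the cutoff, shortest first."""
    selected = [c for c in contacts if c.distance_angstrom <= cutoff_angstrom]
    return sorted(selected, key=lambda c: c.distance_angstrom)


if __name__ == "__main__":
    for contact in contacts_within(3.4):
        print(contact.distance_angstrom, contact.gp120_group, "/", contact.cd4_group)
```

The cutoff is a free parameter of the sketch; the source does not define a threshold for selecting among these contacts.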
  • CD4 is in the top left
  • gp120 is toward the right
  • Fab 17b is in the bottom left of the figure.
  • Photomicrographs of crystals containing HIV-1 gp120. Crystal types A-F are shown and correspond to the crystal types described in the text and Tables 3 and 4.
  • The photomicrograph in A is at twice the magnification.
  • The bar in A corresponds to 25 µm (50 µm for B-F).
  • Lane 1 (Pharmacia Phast system).
  • Lane 1 2.5 µg of ternary complex purified by gel filtration.
  • The top band is the deglycosylated Δ82ΔV1/2*ΔV3ΔC5 gp120, the next two bands are the alkylated and reduced heavy and light chains respectively of the Fab 17b, and the bottom band is the two-domain sCD4 (D1D2).
  • Lane 2 standards: 94, 67, 43 (diffuse), 30, 20, and 14.
  • Lane 3 supernatant from the crystallization droplet.
  • Lane 4 last wash of crystals.
  • Lane 5 dissolved crystals.
  • The gel is silver stained.
  • Figure 5
  • CD4 is toward the bottom and gp120 is toward the top.
  • FIGS 6A and 6B The HIV-1 entry process.
  • The trimeric HIV-1 envelope glycoproteins, anchored in the viral membrane, are depicted, with gp120 in the lower right and gp41 in the upper right.
  • The gp120 variable loops are not shown, but would extend over the outer surface of the envelope glycoprotein complex.
  • The receptors on the target cell, CD4 and chemokine receptor, are also shown.
  • The structures of gp120, gp41, and CD4 are adapted from available X-ray crystallographic studies (5,20,21), whereas the chemokine receptor model is hypothetical.
  • The molecular surface of the HIV-1 gp120 core (20) is shown, with the arrow pointing towards the viral membrane.
  • The inner domain, believed to interact with gp41, and the outer domain, which is probably exposed on the assembled trimer, are on the left and right, respectively.
  • The gp120 surface occluded by CD4 is shown and the gp120 region thought to be involved in chemokine receptor binding (27) is also shown.
  • The location of the base of the V3 loop is shown.
  • Figure 7B Conserved gp120 neutralization epitopes are shown on the gp120 core, which is oriented identically to that in Figure 7A. The location of the epitopes was deduced from mutagenic analysis (45,46,48).
  • V2, V3, and V4 The approximate location of gp120 structures (20) that contribute to protection from antibody responses is shown.
  • The relationship of different surfaces of the gp120 core to the antibody response generated by the gp120 glycoprotein is depicted.
  • The surface of gp120 that interacts with neutralizing antibodies (32) is shown; it spans the inner and outer domains and includes the V2 and V3 variable loops (not shown).
  • The surface of gp120 that interacts with non-neutralizing antibodies is located on the inner domain, and includes gp41-interactive N- and C-terminal gp120 regions (not shown).
  • The heavily glycosylated surface of the gp120 outer domain, which appears to be minimally immunogenic, is also shown.
  • The ribbon diagram shows gp120, the N-terminal two domains of CD4, and the Fab 17b (light chain) and (heavy chain).
  • The sidechain of Phe 43 on CD4 is also shown.
  • The prominent CDR3 loop of the 17b heavy chain is evident in this orientation.
  • Although the complete N- and C-termini of gp120 are missing, the positions of the gp120 termini are consistent with the proposal that gp41, and hence the viral membrane, is located towards the top of the diagram. This would position the target membrane at the diagram base.
  • The vertical dimension of gp120 in this orientation is roughly 50 Å.
  • Precisely perpendicular views of gp120 are shown in Figures 9 and 11. Drawn with RIBBONS (49).
  • Structure of core gp120.
  • The orientation of gp120 in each of the panels shown in this figure is related to Figure 8 by a 90° rotation about a vertical axis.
  • The viral membrane would be oriented above, the target membrane below, and the C-terminal tail of CD4 coming out of the page.
  • the left portion of core gp120 as the "inner" domain
  • the right portion as the "outer" domain
  • the 4-stranded sheet at the bottom left of gp120 as the "bridging sheet."
  • The bridging sheet (β3, β2, β21, β20) can be seen packing primarily over the inner domain, although some surface residues of the outer domain, e.g. Phe 382, reach in to form part of its hydrophobic core.
  • Ribbon diagram. Helices and β-strands are depicted.
  • Strand β15 makes an antiparallel β-sheet alignment with strand C' of CD4.
  • The dashed line to the right of the diagram represents the disordered V4 loop. Selected parts of the structure are labeled.
  • Helices are shown as corkscrews and labeled α1-α5.
  • β-strands are shown as arrows: black and labeled represent the 25 β-strands of core gp120; gray and unlabeled represent the continuation of hydrogen bonding across a sheet; white and labeled represents the C' strand of CD4. Spatial proximity between neighboring strands implies mainchain hydrogen bonding.
  • Loops are labeled ℒA-ℒF and V1-V5. The labels of loops with high sequence variability are circled.
  • Solvent accessibility is indicated for each residue by an open circle if the fractional solvent accessibility is greater than 0.4, a half-closed circle if 0.1 to 0.4, and a closed circle if less than 0.1. Sequence variability observed among primate immunodeficiency viruses is indicated below the solvent accessibility by the number of horizontal hash marks: 1 mark, residues conserved among all primate immunodeficiency viruses; 2 marks, conserved among all HIV-1 isolates; 3 marks, exhibits moderate variation among HIV-1 isolates; and 4 marks, exhibits significant variability among HIV-1 isolates. In assessing conservation, all single atom changes were permitted as well as larger substitutions if the character of the sidechain was conserved (e.g. K to R or F to L).
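The symbol scheme in this legend amounts to a pair of small lookups. The sketch below encodes the accessibility thresholds (greater than 0.4, 0.1 to 0.4, less than 0.1) and the four hash-mark variability levels exactly as the legend states them; the function and dictionary names are hypothetical.

```python
def accessibility_symbol(fractional_accessibility):
    """Map a fractional solvent accessibility to the legend's circle symbol."""
    if fractional_accessibility < 0:
        raise ValueError("fractional accessibility cannot be negative")
    if fractional_accessibility > 0.4:
        return "open circle"
    if fractional_accessibility >= 0.1:
        return "half-closed circle"
    return "closed circle"


# Number of horizontal hash marks -> meaning, as given in the legend.
VARIABILITY_MARKS = {
    1: "conserved among all primate immunodeficiency viruses",
    2: "conserved among all HIV-1 isolates",
    3: "moderate variation among HIV-1 isolates",
    4: "significant variability among HIV-1 isolates",
}


def describe_residue(fractional_accessibility, marks):
    """Return the (symbol, variability description) pair for one residue."""
    return accessibility_symbol(fractional_accessibility), VARIABILITY_MARKS[marks]


if __name__ == "__main__":
    print(describe_residue(0.55, 4))
    print(describe_residue(0.05, 1))
```

The boundary values 0.1 and 0.4 are assigned to the half-closed class here, following the legend's "0.1 to 0.4" wording.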
  • N-linked glycosylation is indicated by "m" for the high mannose additions and "c" for the complex additions observed in mammalian cells (6).
  • Residues of gp120 in direct contact with CD4 are indicated by "*".
  • Direct contact is a more restrictive criterion of interaction than the often used loss of solvent accessible surface; residues of gp120 which show loss of solvent accessible surface but are not in direct contact are 123, 124, 126, 257, 278, 282, 364, 471, 475, 476 and 477. Parts (a) and (b) were drawn with MOLSCRIPT (P. J. Kraulis).
  • FIG. 10B Electron density in the Phe 43 cavity.
  • The 2Fo-Fc electron density map at 2.5 Å, 1.1σ contour, is shown.
  • The orientation is the same as in (a).
  • The foreground has been clipped for clarity, removing the overlying β24-α5 connection.
  • In the upper middle of the picture is the central unidentified density.
  • Phe 43 of CD4 can be seen reaching up to contact the cavity.
  • The gp120 residues are Trp 427 (with its indole ring partially clipped by foreground slabbing), Trp 112, Val 255, Thr 257, Glu 370 (packing under the Phe 43 ring), Ile 371, and Asp 368 (partially clipped in the bottom right corner). Hydrophobic residues lining the back of the cavity can be partially glimpsed around the central unidentified density.
  • Electrostatic surface of CD4 and gp120. The electrostatic potential is displayed at the solvent accessible surface, which is shaded according to the local electrostatic potential. The slight "puffiness" of the surface arises from the enlarged nature of the solvent accessible surface relative to the standard molecular surface.
  • The gp120 surface is shown in an orientation similar to that of Figures 9A and 9C, but rotated ~20° around a vertical axis to depict the recessed binding pocket more clearly.
  • A thin yellow Cα worm of CD4 is shown to aid in orientation.
  • The CD4 surface is shown, rotated relative to the gp120 panel by an exact 180° rotation about the vertical axis shown.
  • A thin red Cα worm of gp120 is shown.
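The panel relationships in these legends (a 90°, roughly 20°, or exact 180° rotation about a vertical axis) can be illustrated with a small coordinate transform. This is a sketch under the assumption that the vertical axis is the y axis of the coordinate frame; the source does not specify a frame, and the function name is hypothetical.

```python
import math


def rotate_about_vertical(point, degrees):
    """Rotate an (x, y, z) point about the y axis (assumed vertical).

    The y coordinate is unchanged; x and z rotate in the horizontal plane.
    """
    x, y, z = point
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (cos_t * x + sin_t * z, y, -sin_t * x + cos_t * z)


if __name__ == "__main__":
    # A 180-degree rotation flips x and z, so two surfaces that faced each
    # other in the complex are both turned toward the viewer.
    print(rotate_about_vertical((1.0, 2.0, 3.0), 180))
```

Applying the 180° rotation twice returns the original coordinates, up to floating-point rounding.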
  • CD4-gp120 contact surface. On the right, the gp120 surface is shown with the surface within 3.5 Å of CD4 (surface-to-atom center distance). This effectively creates an "imprint" of CD4 on the displayed gp120 surface. On the left (180° rotation), the corresponding CD4 surface and gp120 "imprint" are also shown.
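The "imprint" described in this legend is a distance test between surface points on one molecule and atom centers on the other. A minimal sketch follows, using the 3.5 Å surface-to-atom-center cutoff from the legend; the coordinates in the usage example are made-up toy values, and the function name is hypothetical.

```python
import math


def contact_imprint(surface_points, partner_atom_centers, cutoff=3.5):
    """Flag each surface point lying within `cutoff` of any partner atom center.

    Both inputs are sequences of (x, y, z) tuples in angstroms. Returns a list
    of booleans, one per surface point.
    """
    return [
        any(math.dist(point, atom) <= cutoff for atom in partner_atom_centers)
        for point in surface_points
    ]


if __name__ == "__main__":
    # Toy coordinates, not taken from any structure.
    gp120_surface = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    cd4_atoms = [(0.0, 3.0, 0.0), (20.0, 0.0, 0.0)]
    print(contact_imprint(gp120_surface, cd4_atoms))
```

This brute-force loop is quadratic in the number of points and atoms; for a real structure a spatial index would be the usual choice, but that is outside what the legend describes.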
  • The surface of gp120 is shown with the surface of gp120 residues shown by substitution to affect CD4 binding highlighted: substantial effect -- residues 257, 368, 370 and 427; moderate effect -- residue 457.
  • Residues important for gp120 binding are shown on the CD4 surface: substantial effect -- residues 43 and 59; moderate effect -- residues 29, 35, 44, 46, 47.
  • Sequence variability mapped to the gp120 surface. The sequence variability observed among primate immunodeficiency viruses (Figure 9D) is depicted mapped onto the gp120 surface. Also shown is the carbohydrate: N-acetylglucosamine and fucose residues present in the structure; Asn-proximal N-acetylglucosamines modeled at residues 88, 230, 241, 356, 397, 406, 462. Much of the carbohydrate (22 residues) is hidden on the back side of the outer domain.
  • Figure 10H Phe 43 cavity. The surface of the Phe 43 cavity is shown, buried in the heart of gp120.
  • A worm representation of gp120 shows the three stretches that are incorrectly predicted by secondary structure prediction: the ℒB loop, bending around the top of the cavity, strands β20-β21 just below the cavity, and strand β15, slightly more distal to the cavity right.
  • The orientation shown here is the same as for the gp120 surfaces in Figure 10C-10G.
  • Electrostatic surface. The electrostatic potential is displayed at the solvent accessible surface, which is shaded according to the local electrostatic potential.
  • The electrostatic shading is the same scale as that shown in Figure 10C.
  • The surface that corresponds to the 17b epitope is the most electropositive region of the molecule.
  • The V3 loop is truncated here, but sequence analysis shows that it is generally quite positively charged.
  • FIG. 11E Worm diagram of gp120. The gp120 is shown shaded according to the same scheme given in Figure 11A. The orientation is the same as in Figures 11C and 11D, that is, 90° from Figure 11A.
  • This conformational change strains the interactions at the N- and C-termini of gp120 with the rest of the oligomer, priming the CD4-bound gp120 core.
  • The chemokine receptor binds to the bridging sheet and the V3 loop (at the bottom left and right, respectively, of gp120), causing an orientational shift of core gp120 relative to the oligomer. This triggers further steps, which ultimately lead to the fusion of the viral and target membranes.
  • FIG. 13 Structure of HIV-1 gp120 with neutralizing antibody and human receptor CD4.
  • Figure 14B View of the molecular surface of the gpl20 outer domain, from the perspective indicated in Figure 14A.
  • the molecular surface in the figure on the left is shaded according to the variability observed in gpl20 residues among primate immunodeficiency viruses.
  • the variability of the gpl20 surface shown is underestimated since the V4 variable loop, which is not resolved in the structure, contributes to this surface.
  • the position of the V5 region is shown.
  • the highly conserved glycosylation site (asparagine 356 and threonine/serine 358) within the L e loop, between the V5 and V4 regions.
  • the V4 loop and the carbohydrates are modeled, as described in Materials and Methods .
  • Figure 14D View of the molecular surface of the gpl20 core inner domain.
  • variability is indicated by the shading scheme used in Figure 14B .
  • the CD4-binding site is to the right of the figure, and the protruding V1/V2 stem is indicated.
  • the conserved molecular surface, which is associated with the inner domain of the gp120 core, is devoid of known N-linked glycosylation. These are modeled in the figure on the right, which is shaded as described in Figure 14B.
  • FIG 15A The molecular surface of the gpl20 core is shown, from the same perspective as that in Figure 14A.
  • the modeled N-terminal gpl20 core residues, V4 loop and carbohydrate structures are included.
  • the variability of the molecular surface is indicated, using the shading scheme described in Figure 14B .
  • the approximate locations of the V2 and V3 variable loops are indicated. Note the well-conserved surfaces near the "Phe 43" cavity and the chemokine receptor- binding site (see Figure 14A) .
  • FIG. 14A A Cα tracing of the gp120 core, oriented similarly to Figure 14A.
  • the gpl20 residues within Figure 17A of the 17b CD4i antibody are shown.
  • the residues implicated in the binding of CD4BS antibodies (20) are shown. Changes in these residues significantly affect the binding of at least 25 percent of the CD4BS antibodies listed in the table from the fourth series of experiments.
  • the residues implicated in 2G12 binding (19) are shown.
  • the V4 variable loop, which contributes to the 2G12 epitope, (19) is indicated by dotted lines (see figure 14A) .
  • the neutralizing face of the complete gpl20 glycoprotein includes the V2 and V3 loops, which reside adjacent to the surface shown (see Figure 15A) .
  • the approximate location of the gpl20 face that is poorly accessible on the assembled envelope glycoprotein trimer and therefore elicits only non-neutralizing antibodies (5 , 6) is shown.
  • the approximate location of an immunologically "silent" face of gp120, which roughly corresponds to the highly glycosylated outer domain surface, is also shown.
  • a likely arrangement of the HIV-1 gp120 glycoproteins in a trimeric complex The gp120 core was organized into a trimeric array, based on the criteria discussed in the text.
  • the perspective is from the target cell membrane, similar to that shown in Figure 14C.
  • the CD4 binding pockets are indicated by black arrows, and the chemokine receptor-binding regions are darkly shaded.
  • the lightly shaded areas indicate the more variable, glycosylated surface of the gpl20 core.
  • the approximate locations of the 2G12 epitopes are indicated by open arrows. The approximate locations for the V3 loops and V4 regions are shown.
  • the positions of the V5 regions and some complex carbohydrate addition sites are shown.
  • the approximate locations of the large V1/V2 loops, centered on the known positions of the V1/V2 stems, are indicated.
  • On one of the gp120 subunits the positions of the L D and L E loops are indicated.
  • the distance of each of the gp120 monomers from the 3-fold symmetry axis is arbitrary.
  • the HIV gp120 derivative used in the binding assay The wild-type gp120 and gp41 envelope glycoproteins are shown in the upper figure. Conserved (black) and variable (white) regions (25) are indicated.
  • the N-terminal and V1/V2 deletions correspond to those previously described for the HXBc2 gpl20 mutants ⁇ 82 and ⁇ 128-194, respectively (8,9).
  • SIG signal peptide.
  • Figure 18 The gpl20-CCR5 binding assay.
  • FIG 18A The radiolabeled wt ⁇ protein was incubated either with the parental L1.2 cells or with the L1.2-CCR5 cells. Incubations were carried out either in the absence or presence of sCD4 (100 nM). The wt ⁇ protein bound to the cells is shown. The two bands represent different glycoforms of gp120.
  • the wt ⁇ protein was incubated with both sCD4 and 17b antibody at the indicated concentrations prior to addition to the L1.2-CCR5 cells.
  • the L1.2-CCR5 cells were incubated with 2D7 anti-CCR5 antibody or MIP-13 at the indicated concentrations prior to incubation with wt ⁇ -sCD4 complexes.
  • the wt ⁇ protein bound to the cells is shown.
  • FIG. 19 Structure of the HIV-1 gpl20 region implicated in CCR5 binding .
  • a ribbon drawing of the HIV-1 gpl20 glycoprotein (6) complexed with CD4 is shown. The perspective is that from the target cell membrane. The two amino-terminal domains of CD4 are shown. The gpl20 inner domain is shown, the outer domain is shown and the "bridging sheet" is shown. The gpl20 residues in which changes resulted in a >90% decrease in CCR5 binding are labeled. The V1/V2 stem and base of the V3 loop (strands l2 and j ⁇ l3 and the associated turn) are indicated.
  • Figure 19B
  • a molecular surface of the gpl20 glycoprotein from the same perspective as that of Figure 19A is shown. Shaded surfaces are associated with gpl20 residues in which changes resulted in either a ⁇ 75% decrease, a > 90% decrease or a ⁇ 50% increase in CCR5 binding, when CD4 binding was at least 50% of that seen for the wt ⁇ protein.
  • the surface depicted in Figure 19B is shaded according to the degree of conservation observed among primate immunodeficiency viruses (25) .
  • the molecular surface of the gp120 glycoprotein is shown, indicating residues in which changes resulted in a > 70% decrease in 17b antibody binding, in the absence of sCD4.
  • the invention relates to crystals of gp120 suitable for x-ray diffraction.
  • the three dimensional structure of gpl20 provides information which has a number of uses; principally related to the development of pharmaceutical compositions which mimic the action of gpl20.
  • the essence of the invention resides in the obtaining of crystals of gpl20 of sufficient quality to determine the three dimensional (tertiary) structure of the protein by x-ray diffraction methods.
  • This invention provides crystals of sufficient quality to obtain a determination of the three-dimensional structure of gpl20 to high resolution, preferably to the resolution of 2.5 angstroms.
  • the value of crystals of gpl20 extends beyond merely being able to obtain a structure for gpl20.
  • the knowledge of the structure of gpl20 provides a means of investigating the mechanism of action of these proteins in the body. For example, binding of these proteins to various receptor molecules can be predicted by various computer models. Upon discovering that such binding in fact takes place, knowledge of the protein structure then allows chemists to design and attempt to synthesize molecules which mimic the binding of gpl20 to its receptors. This is the method of "rational" drug design.
  • One skilled in the art may use one of several methods to screen chemical entities for their ability to associate with gpl20. This process may begin by visual inspection of, for example, the active site on the computer screen based on the gpl20 coordinates. Docking may be accomplished using software such as Quanta and Sybyl , followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM and AMBER.
  • Specialized computer programs may also assist in the process of selecting fragments or chemical entities. These include:
  • GRID [P.J. Goodford, "A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules", J. Med. Chem. 28:849-857 (1985)]. GRID is available from Oxford University, Oxford, UK.
  • MCSS [A. Miranker and M. Karplus, "Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method", Proteins: Structure, Function and Genetics, 11:29-34 (1991)]. MCSS is available from Molecular Systems, Burlington, MA.
  • AUTODOCK [D.S. Goodsell and A. J. Olson, "Automated Docking of Substrates to Proteins by Simulated Annealing", Proteins: Structure, Function, and Genetics, 195-202 (1990)]. AUTODOCK is available from Scripps Research Institute, La Jolla, CA.
  • Assembly may proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of gp120. This would be followed by manual model building using software such as Quanta or Sybyl.
  • CAVEAT [P.A. Bartlett et al., "CAVEAT: A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules", in "Molecular Recognition in Chemical and Biological Problems", Special Pub., Royal Chem. Soc. 78, pp. 182-196 (1989)]. CAVEAT is available from the University of California, Berkeley, CA.
  • 3D Database systems such as MACCS-3D (MDL Information Systems, San Leandro, CA) . This area is reviewed in Y. C. Martin, "3D Database Searching in Drug Design", J. Med. Chem., 35:2145-2154 (1992).
  • inhibitory or other types of binding compounds may be designed as a whole or "de novo" using either an empty active site or optionally including some portion(s) of known inhibitor(s).
  • LUDI [H.-J. Bohm, "The Computer Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors", J. Comp. Aid. Molec. Design, 6:61-78 (1992)]. LUDI is available from Biosym Technologies, San Diego, CA.
  • LEGEND [Y. Nishibata and A. Itai, Tetrahedron, 47:8985 (1991)]. LEGEND is available from Molecular Simulations, Burlington, MA.
  • the gpl20 or CD4 antagonist may be tested for bioactivity using standard techniques.
  • structure of the invention may be used in binding assays using conventional formats to screen inhibitors .
  • Suitable assays for use herein include, but are not limited to, the enzyme-linked immunosorbent assay (ELISA) or a fluorescence quench assay.
  • Other assay formats may be used; these assay formats are not a limitation on the present invention.
  • the gp120 structure of the invention permits the design and identification of synthetic compounds and/or other molecules which have a shape complementary to the conformation of the gp120 active site of the invention.
  • the coordinates of the gpl20 structure of the invention may be provided in machine readable form, the test compounds designed and/or screened and their conformations superimposed on the structure of the invention. Subsequently, suitable candidates identified as above may be screened for the desired gpl20 inhibitory bioactivity, stability, and the like.
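  • As a minimal sketch of what reading coordinates in machine readable form could look like, the following assumes the coordinates are stored as fixed-column PDB-style ATOM records. That file layout, the names Atom, parse_pdb_atoms and format_atom, and the sample values are assumptions made for illustration; the source does not specify the format of its coordinate table.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str      # atom name, e.g. "CG"
    res_name: str  # residue name, e.g. "ASP"
    chain: str     # chain identifier
    res_seq: int   # residue number
    x: float       # orthogonal coordinates in angstroms
    y: float
    z: float


def parse_pdb_atoms(text: str) -> List[Atom]:
    """Read ATOM/HETATM records laid out in fixed PDB columns (assumed format)."""
    atoms = []
    for line in text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip headers, remarks and blank lines
        atoms.append(
            Atom(
                name=line[12:16].strip(),
                res_name=line[17:20].strip(),
                chain=line[21],
                res_seq=int(line[22:26]),
                x=float(line[30:38]),
                y=float(line[38:46]),
                z=float(line[46:54]),
            )
        )
    return atoms


def format_atom(serial, name, res_name, chain, res_seq, x, y, z) -> str:
    """Write one record in the same fixed-column layout (illustration only)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z
    )


if __name__ == "__main__":
    # Made-up coordinates and chain label, not taken from the structure.
    record = format_atom(1, "CG", "ASP", "G", 368, 1.0, 2.0, 3.0)
    print(parse_pdb_atoms(record))
```

  Once parsed into simple records like these, candidate compound conformations can be compared against the structure with ordinary distance arithmetic.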
  • inhibitors may be used therapeutically or prophylactically to block gpl20 activity.
  • this invention also provides material which is the basis for the rational design of drugs which mimic the action of gpl20.
  • the subject invention provides a crystal suitable for X-ray diffraction comprising a polypeptide having an amino acid sequence of a portion of a Human Immunodeficiency Virus envelope glycoprotein gp120.
  • the subject invention also provides the above-described crystals, which effectively diffract X-rays for determination of the atomic coordinates of the polypeptide to a resolution of 4 angstroms or better than 4 angstroms.
  • the subject invention also provides the above-described crystals, which effectively diffract X-rays for determination of the atomic coordinates of the polypeptide to a resolution of 2.5 angstroms or better than 2.5 angstroms.
  • the subject invention also provides the above-described crystals, wherein the portion of gp120 comprises a CD4 binding site.
  • the subject invention further provides the above-described crystals, further comprising a compound bound to the CD4 site.
  • the subject invention also provides the above-described crystals, wherein the portion of gpl20 comprises a chemokine receptor binding site.
  • the subject invention also provides the above-described crystals, further comprising a compound bound to the chemokine receptor binding site.
  • the subject invention also provides the above-described crystals, wherein the portion of gpl20 comprises a CD4 binding site and a chemokine receptor binding site.
  • the subject invention also provides the above-described crystals, further comprising a first compound bound to the CD4 binding site of the polypeptide and a second compound bound to the chemokine receptor binding site of the polypeptide.
  • the subject invention also provides the above-described crystals, wherein the first compound is the second compound.
  • the subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gp120 lacking the V1, V2, V3, and C5 regions.
  • the subject invention also provides the above-described crystals, wherein the gp120 variant comprises a portion of the conserved stem of the V1/V2 stem-loop structure.
  • the subject invention also provides the above-described crystals, wherein the gpl20 variant comprises a portion of the base of the V3 loop.
  • the subject invention also provides the above-described crystals, wherein the gpl20 variant comprises a portion of the C5 region.
  • the subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gpl20 with 5% by weight of the carbohydrate residues linked to the gpl20 in substantially the same manner as they are linked to gpl20 in unmodified gpl20.
  • the subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gpl20 with 15% by weight of the carbohydrate residues linked to the gpl20 polypeptide in substantially the same manner as they are linked to gpl20 in unmodified gpl20.
  • the subject invention also provides the above-described crystals, further comprising a Fab, a CD4 , a polypeptide having amino acid sequence of a portion of CD4 , or a combination thereof, bound to the gpl20.
  • the subject invention also provides the above-described crystals, wherein the Fab is produced from an antibody to a discontinuous epitope.
  • the subject invention also provides the above-described crystals, wherein the monoclonal antibody is designated 17b.
  • the subject invention additionally provides a method for producing a crystal suitable for X-ray diffraction comprising: (a) deglycosylating a polypeptide having amino acid sequence of a portion of a gp120 wherein said portion is produced by deleting or replacing part of the gp120 to reduce the surface loop flexibility; (b) contacting the polypeptide with a ligand so as to form a complex which exhibits restricted conformational mobility; and (c) obtaining a crystal from the complex so formed to produce a crystal suitable for X-ray diffraction.
  • the subject invention also provides the above-described methods, wherein the V1, V2, or V3 loop of the gp120 contained in the polypeptide is partially truncated, deleted or replaced.
  • the subject invention also provides the above-described methods, wherein the polypeptide lacks the V1, V2, V3 and C5 loop of the gp120.
  • the subject invention also provides the above-described methods, wherein the polypeptide also lacks up to fifty N-terminal amino acids of the gp120 or up to fifty C-terminal amino acids of gp120.
  • the subject invention also provides the above-described methods, wherein the ligand is a Fab, a CD4 , or a polypeptide having amino acid sequence of a portion of CD4.
  • the subject invention also provides the above-described methods, wherein the resulting polypeptide after the deglycosylation contains at least 5% of the carbohydrate.
  • the subject invention also provides the crystal produced by the above-described methods.
  • the subject invention also provides a method for identifying a compound capable of binding to a portion of Human Immunodeficiency Virus envelope glycoprotein gpl20 comprising: (a) determining a binding site on the portion of gpl20 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising the portion of gpl20; and (b) determining whether a compound would fit into the binding site, a positive fitting indicating that the compound is capable of binding to the gpl20.
  • the subject invention also provides a method for designing a compound capable of binding to a portion of Human Immunodeficiency Virus envelope glycoprotein gpl20 comprising: (a) determining a binding site on the portion of gpl20 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising the portion of gpl20; and (b) designing a compound to fit the binding site.
  • the subject invention also provides the above-described methods, wherein the fitting is determined by shape complementarity or by estimated interaction energy.
  • the subject invention also provides the above-described methods, wherein the X-ray diffraction data are set forth in Table A.
  • the subject invention also provides the above-described methods, wherein the atomic coordinates are set forth in Table B.
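  • The fitting by estimated interaction energy mentioned above can be illustrated with a rough sketch. It sums a Lennard-Jones 12-6 term and a Coulomb term over all ligand and binding-site atom pairs. The function name interaction_energy, the single well depth and optimal distance shared by all atom types, and the fixed dielectric constant are simplifying assumptions for illustration; the source names force fields such as CHARMM and AMBER but gives no parameters, and this is not a reproduction of either.

```python
import math
from typing import Sequence, Tuple

# Each atom is (x, y, z, partial_charge); coordinates are in angstroms.
ChargedAtom = Tuple[float, float, float, float]

COULOMB_KCAL = 332.0636  # kcal*angstrom/(mol*e^2), standard conversion factor


def interaction_energy(
    ligand: Sequence[ChargedAtom],
    site: Sequence[ChargedAtom],
    epsilon: float = 0.1,     # assumed well depth, kcal/mol
    r_min: float = 3.5,       # assumed optimal separation, angstroms
    dielectric: float = 4.0,  # assumed uniform dielectric constant
) -> float:
    """Pairwise Lennard-Jones plus Coulomb estimate in kcal/mol."""
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for sx, sy, sz, sq in site:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            if r < 1e-6:
                raise ValueError("overlapping atoms")
            ratio = r_min / r
            # 12-6 form with its minimum of -epsilon at r == r_min
            lennard_jones = epsilon * (ratio ** 12 - 2.0 * ratio ** 6)
            coulomb = COULOMB_KCAL * lq * sq / (dielectric * r)
            total += lennard_jones + coulomb
    return total
```

  A lower (more negative) value suggests a more favorable placement under these assumptions; real scoring would use atom-type-specific parameters and account for solvation.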
  • the subject invention also provides a pharmaceutical composition comprising the compound identified by the above-described methods and a pharmaceutically acceptable carrier.
  • pharmaceutically acceptable carriers means any of the standard pharmaceutical carriers.
  • suitable carriers are well known in the art and may include, but are not limited to, any of the standard pharmaceutical carriers such as phosphate buffered saline solutions, phosphate buffered saline containing Polysorb 80, water, emulsions such as oil/water emulsion, and various types of wetting agents.
  • Other carriers may also include sterile solutions, tablets, coated tablets, and capsules.
  • Such carriers typically contain excipients such as starch, milk, sugar, certain types of clay, gelatin, stearic acid or salts thereof, magnesium or calcium stearate, talc, vegetable fats or oils, gums, glycols, or other known excipients.
  • Such carriers may also include flavor and color additives or other ingredients.
  • Compositions comprising such carriers are formulated by well known conventional methods.
  • the subject invention also provides the above-described methods, wherein the compound is not previously known.
  • the subject invention also provides the compounds identified by the above-described methods.
  • the subject invention also provides the compound designed by the above-described methods.
  • the subject invention also provides a composition comprising the above-described compounds and a suitable carrier.
  • This invention provides a method of inhibiting the interaction of HIV-gp120 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4, which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions: (a) a benzyl group that binds to the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 at a distance of 3.4 ang.
  • a phenyl group that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 at a distance of 3.4-3.5 ang.
  • said distance is the distance between nearest interacting heavy atoms in said groups of gpl20 and CD4 in the crystal structure. Said distances to comparable groups in other gpl20 isolates (shown in parentheses) have not been measured. Side chains do not include alpha carbons or alpha substituents.
  • a method of inhibiting the interaction of HIV-gpl20 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4 , which contains certain functional groups that interact with HIV-gpl20 in a manner that disrupts two or more of the following interactions:
  • a benzyl group that binds to the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 at a distance of 3.4 ang. or otherwise disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
  • a benzyl group that binds to the side chain carboxylate group of gp120 aspartic acid 368 at a distance of 3.1 ang. or otherwise disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine
  • a phenyl group that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 at a distance of 3.4-3.5 ang. or otherwise disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
  • a phenyl group that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 at a distance of 3.1 ang. or otherwise disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • a propylguanidinium group that binds to the side chain carboxyl group of gp120 aspartic acid 368 at a distance of 1.7 ang. or otherwise disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59;
  • a propylguanidinium group that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 at a distance of 3.3 ang. or otherwise disrupts the ionic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
  • j. an amide group that binds to the side chain propylalcohol group of gp120 threonine 123 at a distance of 4.6 ang. or otherwise disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59;
  • a propionamide group that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain propionamide group of CD4 glutamine 40;
  • a propionamide group that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
  • n. a methyl alcohol group that binds to the alpha amino group of gp120 lysine 429 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
  • a methyl alcohol group that binds to the alpha carbonyl group of gp120 lysine 429 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
  • p. a methyl alcohol group that binds to the alpha carbonyl group of gp120 tryptophan 427 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
  • a methyl alcohol group that binds to the alpha amino group of gp120 valine (or alanine) 430 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
  • r. a methyl alcohol group that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 at a distance of 3.7 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
  • a butylammonium group that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang. or otherwise disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
  • c1. an amido group that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • d1. an amido group that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 at a distance of 2.6 ang. or otherwise disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
  • e1. a propionamido group that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 at a distance of 4.2 ang. or otherwise disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
  • a propionamido group that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.9 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33; and/or
  • This invention also provides a method of inhibiting the interaction of HIV-gpl20 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4 , capable of disrupting two or more of the contacts between gpl20 and CD4 as set forth in table C.
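  • The distances quoted for the interactions above are defined as the distance between nearest interacting heavy atoms. A minimal sketch of that measurement follows, together with a count of how many listed contacts a candidate compound approaches. The function names, the atom tuple layout, and the 0.5 angstrom tolerance are assumptions for illustration; proximity alone does not establish that a contact is disrupted.

```python
import math
from typing import Iterable, Sequence, Tuple

# Each atom is (element, x, y, z); coordinates are in angstroms.
GroupAtom = Tuple[str, float, float, float]


def nearest_heavy_atom_distance(
    group_a: Sequence[GroupAtom], group_b: Sequence[GroupAtom]
) -> float:
    """Smallest distance between non-hydrogen atoms of two groups."""
    heavy_a = [a for a in group_a if a[0].upper() != "H"]
    heavy_b = [b for b in group_b if b[0].upper() != "H"]
    if not heavy_a or not heavy_b:
        raise ValueError("each group needs at least one heavy atom")
    return min(math.dist(a[1:], b[1:]) for a in heavy_a for b in heavy_b)


def count_close_contacts(
    pairs: Iterable[Tuple[Sequence[GroupAtom], Sequence[GroupAtom], float]],
    tolerance: float = 0.5,  # assumed slack over the reference distance
) -> int:
    """Count (compound group, gp120 group, reference distance) triples
    whose measured distance is within reference + tolerance."""
    return sum(
        1
        for compound_group, gp120_group, reference in pairs
        if nearest_heavy_atom_distance(compound_group, gp120_group)
        <= reference + tolerance
    )
```

  Under the "two or more" criterion in the text, a candidate would be of interest when count_close_contacts returns at least 2 for the contacts being examined.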
  • This invention also provides a method for identifying a compound capable of binding to the CD4 binding site of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining the CD4 binding site on the gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising a polypeptide having amino acid sequence of a portion of gp120 capable of binding to CD4; and (b) determining whether a compound would fit into the binding site, a positive fitting indicating that the compound is capable of binding to the CD4 binding site of the gp120.
  • This invention also provides a method for designing a compound capable of binding to the CD4 binding site of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining the CD4 binding site on the gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising a polypeptide having amino acid sequence of a portion of gp120 capable of binding to CD4; and (b) designing a compound to fit the CD4 binding site.
  • This invention also provides the above-described methods, wherein the crystal further comprises a CD4, a second polypeptide having amino acid sequence of a portion of CD4, or a compound known to be able to bind to the CD4 site of the gp120, bound to the polypeptide.
  • This invention also provides the above-described methods, wherein the fitting is determined by shape complementarity or by estimated interaction energy.
  • This invention also provides the above-described methods, wherein the X-ray diffraction data are set forth in Table A.
  • This invention also provides the above-described methods, wherein the atomic coordinates are set forth in Table B.
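  • Fitting by shape complementarity, as named above, can be sketched with a simple contact-versus-clash count: a compound atom scores a point when its nearest binding-site atom lies in a contact shell and is penalized when it sits too close. The function name shape_score and the thresholds (3.0 and 4.5 angstroms, penalty of 3) are assumptions for illustration, not values from the source, and published shape-complementarity measures are more elaborate.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # x, y, z in angstroms


def shape_score(
    ligand: Sequence[Point],
    site: Sequence[Point],
    clash: float = 3.0,          # assumed: closer than this is a steric clash
    shell: float = 4.5,          # assumed: up to this counts as a contact
    clash_penalty: float = 3.0,  # assumed cost of one clashing atom
) -> float:
    """Crude fit score: +1 per ligand atom in contact, -penalty per clash."""
    if not site:
        raise ValueError("binding site has no atoms")
    score = 0.0
    for atom in ligand:
        nearest = min(math.dist(atom, s) for s in site)
        if nearest < clash:
            score -= clash_penalty
        elif nearest <= shell:
            score += 1.0
        # atoms farther away than the shell contribute nothing
    return score
```

  Higher scores indicate more compound atoms packed against the site surface without overlap, which is one rough reading of a "positive fitting".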
  • This invention also provides a pharmaceutical composition comprising the compound identified by the above-described methods and a pharmaceutically acceptable carrier.
  • This invention also provides the above-described methods, wherein the compound is not previously known.
  • This invention also provides the compound identified by the above-described methods.
  • This invention also provides the compound designed by the above-described methods.
  • This invention also provides a composition comprising the above-described compounds and a suitable carrier.
  • This invention also provides a method of inhibiting Human Immunodeficiency Virus infection in a subject comprising administering an effective amount of the above-described composition to the subject.
  • the above-described compounds are nonpeptidyl.
  • This invention provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine or alanine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine, or alanine) 371 and CD4 phenylalanine 43;
  • Het an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gpl20 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gpl20 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein Het is phenyl, Bn, EtPh, or heteroarylalkyl
  • X that binds to the side chain carboxylate group of gpl20 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gpl20 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein X is hydroxyalkyl, hydroxyaryl, alkylamide, or arylamide;
  • an aromatic group or heteroaromatic group, Het that binds to the side chain indole group of gpl20 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gpl20 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het that binds to the alpha methylene group of gpl20 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • Het that binds to the alpha carbonyl group of gpl20 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43 ;
  • an alkyl group, R that binds to the beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 or disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43, wherein R is alkyl, cycloalkyl, or haloalkyl;
  • an aromatic group of heteroaromatic group, Het that binds to the alpha carbonyl group of gpl20 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
  • a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59, wherein Z is alkoxyalkyl, aryloxyalkyl, alkoxyaryl, haloalkyl, haloaryl, alkylamide, arylamide, alkylcarboxylate, arylcarboxylate, arylalkyl ester, dialkyl ester, or alkylaryl ester.
  • This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • Het that binds to the side chain carboxylate group of gpl20 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gpl20 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het that binds to the alpha carbonyl group of gpl20 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het that binds to the alpha methylene group of gpl20 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43 ;
  • Het that binds to the alpha carbonyl group of gpl20 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
  • a group, X that binds to the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
  • a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
  • m. a group, Z that binds to the alpha amino group of gpl20 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gpl20 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
  • a group, X that binds to the alpha carbonyl group of gpl20 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
  • p. a group, X that binds to the alpha carbonyl group of gpl20 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gpl20 tryptophan 427 and the alpha amino group of CD4 serine 42;
  • X that binds to the alpha amino group of gpl20 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gpl20 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
  • gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
  • a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
  • Y that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
  • d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • a group, X that binds to the side chain amide (or hydroxyl) group of gpl20 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gpl20 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
  • gl . a group, X that binds to the alpha amino group of gpl20 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gpl20 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
  • a group, X that binds to the alpha carbonyl group of gpl20 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52 ;
  • This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
  • an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het that binds to the alpha, beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het that binds to the alpha carbonyl group of gpl20 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
  • an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • Het that binds to the alpha carbonyl group of gpl20 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
  • a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
  • an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
  • a group, X that binds to the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
  • a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
  • a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
  • X that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
  • X that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
  • X that binds to the alpha amino group of gpl20 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gpl20 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
  • al . a group, X that binds to the alpha amino group of gpl20 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gpl20 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
  • Y that binds to the side chain acetamide (or methyl alcohol) group of gpl20 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide
  • dl . a group, Z that binds to the alpha amino group of gpl20 glycine (or valine) 459 at a distance of 3.4 ang or disrupts the hydrogen bond interaction between the alpha amino group of gpl20 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
  • gl . a group, X that binds to the alpha amino group of gpl20 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gpl20 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
  • a group, X that binds to the alpha carbonyl group of gpl20 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
  • j1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan (or phenylalanine) 112 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
  • k1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain phenyl group of gp120 phenylalanine 382 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
  • an aromatic or heteroaromatic group that binds to the side chain phenolic group of gpl20 tyrosine 384 and/or disrupts the aforementioned interactions of gpl20 with CD4 phenylalanine 43;
  • Y that binds to the side chain carboxyl group of gpl20 glutamic acid 370 and/or disrupts the aforementioned interactions of gpl20 with CD4 phenylalanine 43;
  • r1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 435 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43.
  • This invention also provides
  • This invention comprises compounds of formula (I) or formula (II):
  • X is a group that is designed to mimic Arg- 59 in CD4 ;
  • Y is a group that is designed to mimic Phe-43 in CD4;
  • Z is -NRC(R1)(R2)CO-;
  • Q is -OH, -OR, -NR1R2 or -NH-Lys-OR;
  • W is either or W is NR1R2
  • R, R1, and R2 are the same or different and are hydrogen, alkyl, Bn, EtPh, (cycloalkyl)alkyl, arylalkyl, heteroarylalkyl, haloalkyl, hydroxyalkyl, aminoalkyl, or an amino acid side chain.
  • Y is Aryl(CH2)n-; Cyclohexyl(CH2)n-. Unless otherwise indicated, the terms are defined as follows:
  • alkyl is used herein at all occurrences to mean a straight or branched chain radical of 1 to 6 carbon atoms, unless the chain length is limited thereto, including, but not limited to methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl, and the like.
  • halo or halogen are used interchangeably herein at all occurrences to mean radicals derived from the elements chlorine, fluorine, iodine and bromine.
  • "aryl" or "heteroaryl" are used herein at all occurrences to mean substituted and unsubstituted aromatic ring(s) or ring systems which may include bi- or tri-cyclic systems and heteroaryl moieties, which may include, but are not limited to, heteroatoms selected from O, N, or S.
  • Representative examples include, but are not limited to, phenyl, benzyl, naphthyl, pyridyl, quinolinyl, thiazinyl, and furanyl.
  • Ph is used herein at all occurrences to mean phenyl.
  • moiety Btd. is synthesized by the procedure described in Ngai et al., Tetrahedron, 1993, 49, 3577-3592, (incorporated herein by reference) .
  • This invention also provides the above-described vaccine, wherein the 6 or more amino acids are identical to the amino acids of naturally occurring gpl20.
  • This invention further provides the above-described vaccines, wherein the amino acids are within 1 angstrom of their distances in naturally occurring gp120.
  • This invention also provides the above-described vaccines, wherein the amino acids are within 3 angstroms of their distances in naturally occurring gpl20.
  • This invention provides the above-described vaccines, wherein the amino acids are within 5 angstroms of their distances in naturally occurring gpl20.
  • This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
  • This invention further provides the above-described vaccines, further comprising a carrier.
  • This invention also provides the above-described vaccines, further comprising an adjuvant.
  • This invention provides a vaccine comprising a polypeptide having 6 or more continuous amino acids from the Phe 43 cavity of gpl20.
  • This invention provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
  • This invention also provides the above-described vaccines, further comprising a carrier.
  • This invention further provides the above-described vaccines, further comprising an adjuvant.
  • This invention further provides a vaccine comprising a polypeptide having 6 or more amino acids in the same spatial proximity to each other as the surface accessible amino acids adjacent to the Phe 43 cavity of naturally occurring gpl20.
  • This invention also provides the above-described vaccines, wherein the 6 or more amino acids are identical to the amino acids of naturally occurring gpl20.
  • This invention provides the above-described vaccines, wherein the amino acids are within 1 angstrom of their distances in naturally occurring gp120.
  • This invention also provides the above-described vaccines, wherein the amino acids are within 3 angstroms of their distances in naturally occurring gpl20.
  • This invention further provides the above-described vaccines, wherein the amino acids are within 5 angstroms of their distances in naturally occurring gpl20.
  • This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
  • This invention further provides the above-described vaccines, further comprising a carrier.
  • This invention also provides the above-described vaccines, further comprising an adjuvant.
  • This invention also provides the above-described vaccines, wherein the surface accessible amino acids comprise Lysine 432, Proline 369, and Threonine 373.
  • This invention further provides a vaccine comprising a polypeptide having 6 or more continuous surface accessible amino acids adjacent to the Phe 43 cavity of gpl20.
  • This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
  • This invention further provides the above-described vaccines, further comprising a carrier.
  • This invention also provides the above-described vaccines, further comprising an adjuvant.
  • This invention further provides a method of inhibiting cell entry by HIV, comprising blocking or inhibiting the residues from 2 or more of the sets of the CCR5-binding residues set forth above, thereby inhibiting or preventing gp120 from binding to CCR5 and thereby inhibiting cell entry by HIV.
  • This invention also provides the above-described method, wherein 3 or more of the sets of the CCR5-binding residues set forth above are blocked or inhibited from interacting with CCR5.
  • This invention also provides the above described methods, wherein the blocking or inhibiting comprises contacting the CCR5-binding residues with an antibody.
  • Crystalline order is explicitly dependent on lattice homogeneity. Reducing heterogeneity can be thought of as increasing the proportion of surface area available for formation of lattice contacts, increasing the probability of crystallization.
  • the probability that a single lattice contact between two molecules is homogeneous is in part related to the fraction of surface that is homogeneous on one molecule multiplied by the fraction homogeneous on the other, that is, to: $H^2$
  • the overall crystallization probability is related to $\sum (H - \sigma)^{2C}$, where the sum is over all possible lattices, $H$ is the fraction of the surface which may form lattice contacts, $\sigma$ is a function of the size of the lattice contact and the degree of surface homogeneity (related to the occlusion of available surface area upon formation of each lattice contact as well as the spatial distribution of homogeneous surface over the molecular surface), and $C$ is the number of unique contacts required to make a set of symmetry-related molecules into a crystal lattice.
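  • The dependence on surface homogeneity can be illustrated numerically. The following is a minimal sketch, not the inventors' calculation: it reads the expression above as a sum over candidate lattices of (H - sigma)^(2C), and the function names, the symbol name sigma for the occlusion term, and all input values are assumptions chosen only for illustration.

```python
from typing import Iterable, Tuple


def lattice_term(h: float, sigma: float, contacts: int) -> float:
    """Contribution of one candidate lattice: (H - sigma) ** (2 * C).

    h        -- fraction of the surface able to form lattice contacts (0..1)
    sigma    -- occlusion/homogeneity correction for this lattice (assumed symbol)
    contacts -- number of unique contacts C needed to build the lattice
    """
    if not (0.0 <= sigma <= h <= 1.0):
        raise ValueError("expected 0 <= sigma <= h <= 1")
    if contacts < 1:
        raise ValueError("a lattice needs at least one unique contact")
    return (h - sigma) ** (2 * contacts)


def relative_crystallization_score(
    h: float, lattices: Iterable[Tuple[float, int]]
) -> float:
    """Sum the per-lattice terms; the result is only proportional to a probability."""
    return sum(lattice_term(h, sigma, c) for sigma, c in lattices)


if __name__ == "__main__":
    # Hypothetical set of candidate lattices: (sigma, C) pairs.
    candidates = [(0.1, 3), (0.1, 4), (0.2, 3)]
    for h in (0.6, 0.9):
        score = relative_crystallization_score(h, candidates)
        print(f"H = {h:.1f}: relative score = {score:.4f}")
```

  • Because the exponent is 2C, a modest increase in the homogeneous surface fraction raises every term steeply, which is the motivation for removing heterogeneous regions before screening.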
  • the use of multiple variants of the same protein also increases the probability of crystal formation.
  • the overall probability of crystallization is exponentially related to the number of variants. Assuming independence of variants (a reasonable assumption with different protein ligands; not as valid with minor changes) with n variants and a probability of crystallization for each variant of P, the overall probability $P_{\Sigma}$ is: $P_{\Sigma} = 1 - (1 - P)^{n}$
  • the overall probability is $1 - (1 - 0.25)^{n}$; with 15 variants, the probability increases to almost 99%.
  • the enhancement in overall probability is given by the ratio $(P_{\Sigma} / P) - 1$. If one tries many variants, and $(1 - P)^{n}$ is much smaller than 1, then $P_{\Sigma}$ approaches 1 and the enhancement approaches $(1 / P) - 1$.
  • the enhancement is related to the initial probability of crystallizing a single variant.
  • the more difficult a protein is to crystallize the more it benefits from this multiple variant strategy.
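  • The multiple-variant argument above can be checked with a few lines of arithmetic. This is a minimal sketch under the stated independence assumption; the function names are hypothetical and the per-variant probabilities other than 25% are illustrative only.

```python
def overall_probability(p: float, n: int) -> float:
    """Probability that at least one of n independent variants crystallizes."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def enhancement(p: float, n: int) -> float:
    """Enhancement over a single variant: (overall / p) - 1."""
    return overall_probability(p, n) / p - 1.0


if __name__ == "__main__":
    # The 25% / 15-variant case quoted in the text.
    print(f"P = 0.25, n = 15: overall = {overall_probability(0.25, 15):.4f}")
    # The limiting enhancement (1/P - 1) grows as P shrinks.
    for p in (0.25, 0.05):
        print(f"P = {p}: limiting enhancement = {1.0 / p - 1.0:.1f}")
```

  • With P = 0.25 and n = 15 the overall probability is about 0.9866, consistent with the "almost 99%" figure, and the limiting enhancement of 1/P - 1 is larger for a protein with a lower single-variant probability.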
  • gp120 constructs. The various recombinant gp120 glycoproteins used for crystallization trials were produced in stable Drosophila Schneider 2 producer lines under the control of an inducible promoter as previously described (20) (Table 1).
  • Sequence numbers refer to the translated gpl60, with the mature gpl20 beginning at +31. N-terminal sequencing showed that all constructs contained 4 additional amino acids, Gly-Ala-Arg-Ser, an artifact of the signal peptide cleavage. GAG here refers to the tripeptide, Gly-Ala-Gly, which was substituted for the removed amino acids.
  • the N-terminal two domains of CD4 (D1D2), residues 1-182, were produced in Chinese hamster ovary (CHO) cells and purified as described previously (21) .
  • Secreted gpl20 from Drosophila cells was purified by F105-Protein A affinity chromatography which used a glycine pH 2.8 elution step followed by immediate Tris base neutralization.
  • Fabs were produced by papain digestion of monoclonal antibodies. Briefly, the antibody was reduced in 100 mM DTT, 100 mM NaCl, 50 mM
  • PBS phosphate-buffered saline
  • alkylating solution PBS titrated to pH 7.5 with 2 mM iodoacetamide, 48hr
  • the reduced and alkylated antibody was concentrated to at least 2 mg/ml and digested with papain using the commercial protocol (Pierce) .
  • An additional gel filtration chromatographic step on a Superdex S-200 column (Pharmacia, FPLC) was added to ensure oligomeric homogeneity.
  • the gp120 proteins were subjected to digestion with the proteases papain, elastase, and subtilisin (Boehringer Mannheim) to assay for proteolytic susceptibility.
  • the gpl20 concentration was kept constant and the protease diluted serially (3.3x) from a ratio of 1:10 to 1:1000.
  • the digestion mix was incubated for 1 hr at 37 °C and quenched by addition of 1% SDS (1:10 ratio) with immediate heating in boiling water for 2 minutes. Digestion products were analyzed with SDS-polyacrylamide gel electrophoresis (PAGE) with and without DTT reduction.
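  • The protease titration described above can be laid out as a simple dilution series. This is a minimal sketch: the function name is hypothetical, and the number of steps (five) is an assumption chosen so that a 3.3-fold series starting at 1:10 ends near 1:1000; the source does not state how many dilutions were used.

```python
from typing import List


def dilution_series(start: float, factor: float, steps: int) -> List[float]:
    """Return the substrate-to-protease ratios (the x in 1:x) for a serial dilution.

    start  -- first ratio, e.g. 10 for 1:10
    factor -- fold dilution per step, e.g. 3.3
    steps  -- number of points in the series (assumed, not from the source)
    """
    if start <= 0 or factor <= 1 or steps < 1:
        raise ValueError("need start > 0, factor > 1, steps >= 1")
    return [start * factor ** i for i in range(steps)]


if __name__ == "__main__":
    for ratio in dilution_series(10.0, 3.3, 5):
        print(f"protease : gp120 = 1:{ratio:.0f}")
```

  • A factor of exactly the square root of 10 (about 3.16) would land on 1:1000 after four steps, so the quoted 3.3-fold factor reaches roughly the same endpoint.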
  • Carboxypeptidase Y digestion was used to analyze the C-terminus of gp120. A 1:10 ratio of carboxypeptidase Y (Boehringer Mannheim) to gp120 was incubated for 1 hr at 37 °C, pH 7.0. Even though digestion could not be easily seen by SDS-PAGE, the C-terminus of gp120, HXBc2 strain, contains a number of positively charged amino acids, and the extent of the reaction could be monitored by native-PAGE.
  • Drosophila-produced gp120 proteins were deglycosylated enzymatically. Briefly, 0.5 mg/ml of gp120 was incubated with various deglycosylating enzymes (singly or in combination) in 0.5 M NaCl, 100 mM Na Acetate, pH 5.7, for 10 hr at 37 °C. Endoglycosidase D was used at a concentration of 0.1 U/ml, Endoglycosidase F at 0.25 U/ml, Endoglycosidase H at 0.25 U/ml, and Glycopeptidase F at 0.1 U/ml (all from Boehringer Mannheim).
  • Monoclonal antibody binding assay. The various gp120 glycoproteins were assessed for recognition by a variety of monoclonal antibodies directed against both linear and discontinuous gp120 epitopes by either immunoprecipitation (31) or by ELISA (32).
  • the ELISA was performed with both fully glycosylated and deglycosylated ΔV1/2 ΔV3 glycoproteins immobilized on ELISA plates using a capture antibody specific for the gp120 carboxyl-terminus, 6205 (International Enzymes) (32).
  • Crystallization. The vapor diffusion hanging droplet technique was used for all crystallizations. Small volumes, 0.5 µl protein solution + 0.5 µl reservoir solution, were used for virtually all crystallizations, screenings as well as final optimizations.
  • Crystal Screen I (Hampton Research) was used, augmented by roughly 20 conditions which tested high protein concentrations (vapor diffusion concentration of the protein at various pHs) as well as mixtures of organic additives (2-5% MPD, PEG 400, or PEG 4000) combined with high ionic strength (2-4 M NaCl, Am2SO4 or Na/K ⁇ O ) at pH 5.5-9.5.
  • Type E crystals were grown from the following conditions: Protein (Δ82 ΔV1/2* ΔV3 ΔC5 gp120, two-domain CD4 (D1D2), and Fab 17b, purified as a ternary complex on the Superdex S-200); Droplet (0.5 µl protein solution consisting of ~10 mg/ml protein in gel filtration buffer + 0.4 µl droplet mix containing 0.1 M NaCitrate, 0.02 M NaHepes, 10% isopropanol, 8% PEG 5000 (Fluka), 0.0075% SeaPrep Agarose (FMC BioProducts), pH 6.4); Reservoir (0.35 M NaCl, 0.1 M NaCitrate, 0.02 M Hepes, 10% isopropanol, 8% PEG 5000, pH 6.4).
  • Lattice contacts are made solely at the molecular surface. Unlike small molecules, macromolecules have interiors; considerable surface variability, and hence crystallization variability, is tolerated while maintaining the same basic fold or even enzymatic abilities.
  • a prescient example that pre-dates the powerful methods of modern molecular biology was John Kendrew's screening of myoglobins from many different organisms until he found one, from sperm whale, that crystallized well (37). Indeed, human myoglobin requires a Lys to Arg mutation in order to produce crystals suitable for structural analysis (38).
  • crambin is actually a mixture of two isoforms with sequence variation at internal residues (39) .
  • the molecular weight for the glycosylated gp120 is approximately 90 kDa; the deglycosylated gp120, 60 kDa; and the deglycosylated ΔV1/2 ΔV3 gp120, 47 kDa.
  • the N-terminus is resistant to proteolysis from +39 to +82, and thus probably adopts an ordered conformation. This number was calculated assuming only the C-terminal 19 and the N-terminal 8 amino acids were disordered.
  • Variants of gpl20 were developed through an iterative cycle which strove to eliminate heterogeneity.
  • the cycle involved recombinant production of gp120 variants, deglycosylation, and then assessment of heterogeneity and flexibility by examination of glycosylation status, monoclonal antibody binding, and protease sensitivity, leading to the design of new constructs. For example, at the gp120 C-terminus, protease digestion and native PAGE detected variability, and carboxypeptidase Y digestion generated a 15-20 amino acid deletion which retained CD4 binding activity.
  • variable loops V1, V2, and V3 were replaced. Little effect was found on CD4 binding (32,40,41).
  • Three constructs were made which contained deletions of the V1, V2, and V3 loops (Table 1). With ΔV1/2 ΔV3, the entire base and stem of the variable loops V1, V2 and V3 were excised. With ΔV1/2* ΔV3, the conserved stem of the V1/V2 stem-loop structure was retained, restoring the CD4-induced antibody epitopes in the presence of soluble CD4. With ΔV1/2* ΔV3*, the base of the V3 loop was retained as well, fully restoring CD4-induced antibody epitopes, even in the absence of soluble CD4.
  • Endoglycosidase H, which has specificity for oligosaccharides with 5-9 mannoses, removed roughly 60% of the carbohydrate, and addition of Endoglycosidase D, which cleaves oligosaccharides with 3 or 4 mannoses, removed up to 90% of the carbohydrate.
  • Protein ligands, CD4 and the Fabs of monoclonal antibodies, were used in an attempt to reduce overall surface mobility, and hence potential crystal lattice mobility. This was complicated by the internal mobility of these ligands: CD4 has a flexible juncture between the second and third extracellular domains (42), and Fabs have a conformationally mobile "elbow bend" between their variable and constant domains (43).
  • With CD4, we used a construct containing only the N-terminal two domains (1-182), for which there was previous success in structure determination (14).
  • With monoclonal antibodies, we limited the crystallization screens to using only one Fab at a time, even though combinations with multiple Fabs were possible.
  • Fab F105 no crystals to the percent of N- linked sites cleaved by endoglycosidase D or H.
  • the "fully deglycosylated" protein still contains N-acetyl glucosamine and fucose moieties.
  • D1D2 sCD4 refers to two-domain soluble CD4. Antibody epitopes are described in the text.
  • small volume droplets were used, typically 0.5 µl of protein per crystallization trial. With small volumes, only 1-2 mg of protein were sufficient to evaluate each gp120 crystallization variant. Smaller volumes were found to be more efficient at nucleation than larger droplets, perhaps due to higher surface tension effects resulting in greater variations in precipitant concentration, thereby permitting each droplet to sample a wider range of precipitant concentrations. Indeed, droplets that were "spread-out" also showed enhanced nucleation. This explanation may also account for the well-known observation that crystals frequently nucleate from the edges of crystallization droplets.
  • the initial crystallization screens produced six different types of crystals (Table 4). For crystal types A-D, extensive optimization was unable to produce single crystals large enough to be characterized.
  • For crystal types E and F, single crystals of needle morphology could be grown. With type E crystals, the needle axis was coincident with the a axis, with the cross-section perpendicular to the needle axis a rhombohedron bounded by faces of the form (0 1 1) and (0 -1 1). These could be distinguished from type F crystals, where the cross-section was hexagonal.
  • Single crystals of type E and F were analyzed for diffraction in capillary mounts. Only type E crystals showed diffraction. Gel electrophoresis of these crystals demonstrated that they contained gp120, D1D2 and Fab 17b (Figure 4).
  • D1D2 sCD4 refers to the two-domain soluble CD4. ** The protein concentration is given as the absorbance (280 nm) of the complex per ml of solution. *** Most of the reservoirs are conditions from Crystal Screen 1 (Hampton Research); the reagent numbers given here refer to the crystallization reagent from this commercial kit. Hanging droplets were 0.5 µl protein (in 0.35 M NaCl, 5 mM Tris pH 7.0, 0.02% NaN3) + 0.5 µl reservoir, except for crystal type B, which used 0.5 µl of 3-fold diluted reservoir.
  • Crystallization reservoirs were 500 µl; an additional 35 µl of 5 M NaCl was added after the droplet was mixed to compensate for the NaCl in the protein solution. All dilutions used H2O, except for crystal type F, where 22.5% isopropanol was used. Crystallizations were set up at room temperature and incubated at 20 °C.
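  • The NaCl compensation step can be checked with a simple mixing calculation. This is a minimal sketch with a hypothetical function name; it assumes ideal volume additivity and, for the example, a reservoir that initially contains no NaCl, which will not hold for every screen condition.

```python
def mixed_concentration(v1: float, c1: float, v2: float, c2: float) -> float:
    """Concentration after combining volume v1 at c1 with volume v2 at c2.

    Volumes share one unit (e.g. microliters) and concentrations share one
    unit (e.g. molar); ideal additivity of volumes is assumed.
    """
    total = v1 + v2
    if total <= 0:
        raise ValueError("total volume must be positive")
    return (v1 * c1 + v2 * c2) / total


if __name__ == "__main__":
    # 500 ul reservoir (assumed NaCl-free) plus 35 ul of 5 M NaCl.
    reservoir_nacl = mixed_concentration(500.0, 0.0, 35.0, 5.0)
    print(f"NaCl added to reservoir: {reservoir_nacl:.3f} M")
    # For comparison, the protein solution is in 0.35 M NaCl.
    print("NaCl in protein solution: 0.350 M")
```

  • Under these assumptions the addition raises the reservoir by about 0.33 M NaCl, close to the 0.35 M NaCl carried in by the protein solution, so the droplet is not driven to concentrate simply by its own salt.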
  • Table 6, which follows, shows the critical residues in gp120 for interactions with CD4. Table 6. Critical Residues in gp120 for Interactions with CD4
  • Kelders H. A., Kalk, K. H. , Gros, P., and . G., H.
  • the human immunodeficiency viruses (HIV-1 and HIV-2) and simian immunodeficiency viruses (SIV) are the etiologic agents of acquired immunodeficiency syndrome (AIDS) in their respective human and simian hosts (1).
  • infection with primate immunodeficiency viruses is characterized by an initial phase of high-level viremia, followed by a long period of persistent virus replication at a lower level (2) .
  • Viral persistence occurs despite specific antiviral immune responses, which include the generation of neutralizing antibodies.
  • the primate immunodeficiency viruses like all retroviruses, are surrounded by an envelope consisting of a host cell-derived lipid bilayer and virus-encoded envelope glycoproteins (3) .
  • the viral membrane To enter host cells, the viral membrane must be fused with the plasma membrane of the cell, a process mediated by the envelope glycoproteins.
  • the exposed location of these proteins on the virus allows them to carry out their function but also renders them uniquely accessible to neutralizing antibodies.
  • dual selective forces, virus replication and immune pressure have shaped the evolution of the envelope glycoproteins and continue to do so within each infected host.
  • the envelope glycoproteins are synthesized as approximately 845-870 amino acid precursor in the rough endoplasmic reticulum. N- linked, high- mannose sugar chains are added to form the gpl60 glycoprotein, which assembles into oligomers (4-6) . The preponderance of evidence suggests that these oligomeric complexes are trimers (4,5) .
  • the gpl60 trimers are transported to the Golgi apparatus, where cleavage by a cellular protease generates mature envelope glycoproteins: gpl20, the exterior envelope glycoprotein, and gp41, the transmembrane glycoprotein (3) .
  • the gp41 glycoprotein possesses an ectodomain that is largely responsible for trimerization (7) , a membrane-spanning anchor, and a long cytoplasmic tail. Most of the surface-exposed elements of the mature, oligomeric envelope glycoprotein complex are contained on the gpl20 glycoprotein. Selected, presumably well-exposed, carbohydrates on the gpl20 glycoprotein are modified in the Golgi apparatus by the addition of complex sugar (6) . The gpl20 and gp41 glycoproteins are maintained in the assembled trimer by non-covalent , somewhat labile interactions between the gp41 ectodomain and discontinuous structures composed of N- and C-terminal gpl20 sequences (8) .
  • Virus attachment also involves the interaction of the gp120 envelope glycoproteins with specific receptors, the CD4 glycoprotein (11) and members of the chemokine receptor family (12, 13) (Fig. 6).
  • The CD4 glycoprotein is expressed on the surface of T lymphocytes, monocytes, dendritic cells, and brain microglia, the main target cells for primate immunodeficiency virus in vivo. The requirement for CD4 binding exhibited by most primate immunodeficiency viruses for efficient entry is consistent with this observed in vivo tropism.
  • A major function of CD4 binding is to induce conformational changes in the gp120 glycoprotein that contribute to the formation and/or exposure of the binding site for the chemokine receptor (13, 14).
  • The use of CD4 as a receptor may have evolved subsequently, allowing the high-affinity chemokine receptor-binding site of primate immunodeficiency viruses to be sequestered from host immune surveillance.
  • The more conserved regions fold into a gp120 core, which has recently been crystallized in a complex with fragments of CD4 and a neutralizing antibody (20).
  • The gp120 core is composed of two domains, an inner domain and an outer domain (Fig. 7a). These names reflect the likely orientation of gp120 in the assembled envelope glycoprotein trimer: the inner domain faces the trimer axis and, presumably, gp41, while the outer domain is mostly exposed on the surface of the trimer. Elements of both domains contribute to CD4 binding.
  • CD4 binds in a recessed pocket on gp120, making extensive contact over approximately 800 Å² of the gp120 surface.
  • A shallow cavity is filled with water molecules, while a deep cavity extends 10-15 Å into the interior of gp120.
  • The opening of this deep cavity is occupied by phenylalanine 43 of CD4, which has been shown by mutagenic analysis to be critical for gp120 binding (21).
  • Most of the gp120 residues previously identified as important for CD4 binding (22, 23) surround the opening of the deep cavity and contribute to interactions with phenylalanine 43 of CD4.
  • Aspartic acid 368 of gp120 forms a salt bridge with arginine 59 of CD4, also shown by mutagenesis to be important for gp120 binding (21).
  • Main-chain atoms on gp120 and CD4 form hydrogen bonds bridging the two proteins.
  • The formation of the deep cavity in gp120 likely contributes to the transmission of CD4-induced conformational changes to gp120 elements involved in the interaction with chemokine receptors and/or gp41.
  • The deep cavity may be a useful target for intervention by small molecular weight compounds.
  • CCR5 chemokine receptors
  • V3-deleted versions of gp120 do not bind CCR5, even though CD4 binding occurs at wild-type levels (14).
  • Antibodies against the V3 loop interfere with gp120-CCR5 binding (14).
  • These results support an involvement of the V3 loop in chemokine receptor binding.
  • Other, conserved gp120 structures also appear to play an important role in chemokine receptor binding.
  • Antibodies that recognize conserved, discontinuous gp120 epitopes that are more exposed after CD4 binding are potent inhibitors of the gp120-CCR5 interaction (14). These CD4-induced (CD4i) epitopes are discussed further below.
  • Recent mutagenic and structural analyses have revealed the existence of a highly conserved gp120 structure that is important for CCR5 binding (20, 27) (Fig. 7, a and b). This structure is adjacent to the V3 loop and the CD4i epitopes, and is oriented to face the target cell upon gp120-CD4 binding.
  • The gp41 ectodomain structures reveal an extended, trimeric coiled coil that could potentially bridge the viral and target cell membranes (5).
  • Interactions of other gp41 helical segments near the membrane-spanning region with the interhelical grooves of the internal coiled coil are important for fusion-related conformational changes in gp41. This interaction can be inhibited by helical peptides that mimic either of the involved gp41 helices.
  • HIV-1 envelope glycoproteins as antigens.
  • The success of these viruses in achieving persistent infections implies that the viral envelope glycoproteins have evolved to be less-than-ideal immunogens and antigens.
  • Structures on the viral envelope glycoproteins that are conserved among diverse viral strains are, in general, poorly exposed to the humoral immune system.
  • The moieties involved in gp120-gp41 association are buried in the interior of the functional envelope glycoprotein spike (18, 31, 32).
  • The CD4 binding site is recessed, flanked by variable regions exhibiting considerable glycosylation.
  • HIV-1 viruses that have been passaged in immortalized cell lines are typically more sensitive to neutralization by antibodies or soluble CD4 than are primary, clinical isolates (34).
  • A major determinant is the structure of the gp120 major variable loops, V1/V2 and V3 (35).
  • Replacing the V1/V2 and V3 variable loops of a laboratory-adapted virus with those of a neutralization-resistant primary isolate creates a virus similar to the parental primary virus (35).
  • The basis for the decreased sensitivity of primary HIV-1 isolates to neutralization appears to involve a decreased exposure of the relevant gp120 epitopes to soluble CD4 or antibody.
  • the temporal pattern of the antibody response to HIV-1 infection The noncovalent nature of the association between gpl20 and gp41 contributes to the lability of the functional envelope glycoprotein trimer (8,9).
  • The interactive regions of gp120 and gp41 are particularly immunogenic (37).
  • Because the cognate antibodies cannot bind the assembled, functional envelope glycoprotein complex, they do not exhibit neutralizing activity.
  • Although antibodies against the envelope glycoproteins typically can be detected in the sera of HIV-1-infected individuals by two to three weeks after infection, most of these antibodies lack the ability to inhibit virus infection. By the time that neutralizing antibodies are efficiently elicited, HIV-1 is firmly established in the host.
  • Neutralizing antibodies can be detected in the sera of infected animals or humans (38). These antibodies neutralize the infecting virus but often exhibit little or no activity against other strains of virus. A subset of these strain-restricted antibodies recognize the HIV-1 V3 loop (38). These antibodies can block chemokine receptor binding.
  • Variable gp120 elements can contribute to the epitopes recognized by the strain-restricted neutralizing antibodies. It is known, for example, that antibodies directed against the gp120 V2 loop can also exhibit neutralizing activity (39).
  • The V2 loop-associated neutralization epitopes are typically conformation-dependent.
  • The ability of some V2- or V3-directed antibodies to recognize more than one HIV-1 strain (39, 40) suggests that these major variable loops assume a finite number of conformations. This is consistent with the functional consequences on virus entry of some changes in these variable structures (41), and with the observation that amino acid substitutions in the variable loops are not random (42).
  • The requirement for chemokine receptor binding probably constrains V3 loop variation.
  • The V2 loop, although dispensable for the replication of some HIV-1 viruses in culture (33), helps protect the V3 loop and the conserved epitopes near the chemokine receptor binding site from neutralizing antibodies.
  • The V2 and V3 loops reside proximal to the chemokine receptor binding site (Fig. 7), masking more conserved gp120 elements and presenting potentially variable epitopes to the immune system.
  • The gp120 residues important for antibody binding are all located within the CD4-binding pocket on gp120 (Fig. 7b), and several of the most important residues are near the opening of the deep cavity (20). Therefore, some broadly neutralizing antibodies can apparently access the more recessed elements of the CD4 binding pocket. This is consistent with the observation that the gp120-CD4 interface is as large as that of a typical antibody-antigen complex (20).
  • CD4i CD4-induced epitopes
  • The CD4i epitopes are located near conserved gp120 structures important for chemokine receptor interaction (14) (Fig. 7b).
  • CD4 binding has been shown to cause a change in the V2 loop conformation that allows better CD4i epitope exposure (33).
  • The antibodies recognizing the CD4i epitopes must bypass the overlapping V2 and V3 loops (33). Indeed, as is evident in the current crystal structure (20), this is accomplished by the protrusion of the CDR3 loop of the antibody heavy chain.
  • Antibodies against CD4i epitopes need to bind viruses before CD4 binding occurs to achieve neutralization (47).
  • The reason is that once the envelope glycoprotein complex binds cell surface CD4, there are severe steric constraints on the binding of an antibody to the gp120 surface facing the target cell (Fig. 6).
  • Another fairly conserved gp120 neutralization epitope is recognized by the 2G12 antibody (48).
  • Unlike the other characterized HIV-1 neutralizing antibodies, which recognize gp120 structures near or within the receptor-binding sites, the 2G12 antibody apparently binds an epitope in the outer domain (Fig. 7b). Given the variability in this outer domain, the ability of the 2G12 antibody to neutralize a fair number of HIV-1 strains (48) seems paradoxical.
  • The 2G12 antibody may recognize more conserved carbohydrate structures formed as a result of the heavy concentration of N-linked glycosylation in the gp120 outer domain.
  • the apparent rarity with which 2G12-like antibodies are elicited attests to the success of the viral strategy of employing a heavily glycosylated outer domain surface in immune evasion.
  • The HIV-1 envelope glycoproteins as vaccine components. That the human and simian immunodeficiency virus envelope glycoproteins are not ideal immunogens is an expected consequence of the immunological selective forces that drove the evolution of these viruses.
  • The same features of the envelope glycoproteins that dictate poor immunogenicity in natural infections have hampered vaccine development.
  • The lability of the envelope glycoprotein complex has frustrated attempts to present oligomers mimicking the functional spike to the immune system.
  • The disintegration of envelope glycoprotein oligomers contributes to the preferential elicitation of non-neutralizing antibodies by the newly exposed gp120 N- and C-termini.
  • Variable loops elicit the majority of neutralizing antibodies, probably due to the exposed nature of these epitopes. It is still unclear whether conserved features in the V2 and V3 variable loops exist that can be exploited in vaccine design, or whether all possible functional configurations of these variable structures need to be represented in a cocktail of immunogens.
  • The discontinuous gp120 structures surrounding the receptor binding sites exhibit a relatively high degree of conservation (20), in keeping with the minimal polymorphism in the host cell receptors.
  • The CD4 binding site constitutes a particularly attractive target. It appears to be accessible to antibodies, more so than the conserved elements of the chemokine receptor-binding region. A large fraction of the broadly neutralizing antibodies that eventually appear in HIV-1-infected individuals is directed against the CD4 binding site (43), indicating the ability of the human immune system to recognize this gp120 region and to generate an appropriate response. Nonetheless, these antibodies have been difficult to elicit in animals and vaccinated humans.
  • HIV-1 envelope glycoproteins have evolved to be inefficient at eliciting effective antiviral antibody responses.
  • The availability of structural information on the conserved HIV-1 gp120 neutralization epitopes should facilitate the modification of this important antigen and allow the rational testing of hypotheses regarding its poor immunogenic properties. These efforts should complement ongoing efforts to improve antigen presentation to the immune system and to create suitable animal models for the screening of vaccine candidates.
  • HIV human immunodeficiency virus
  • The entry of human immunodeficiency virus (HIV) into cells requires sequential interactions of the viral exterior envelope glycoprotein, gp120, with the CD4 glycoprotein and a chemokine receptor on the cell surface. These interactions initiate a fusion of the viral and cellular membranes.
  • Although gp120 can elicit virus-neutralizing antibodies, HIV eludes the immune system.
  • The structure reveals a cavity-laden CD4-gp120 interface, a conserved binding site for the chemokine receptor, evidence for conformational change upon CD4 binding, the nature of a CD4-induced antibody epitope, and specific mechanisms for immune evasion.
  • Our results provide a framework for understanding the complex biology of HIV entry into cells and will guide efforts to intervene.
  • HIV-1 and HIV-2 and the related simian immunodeficiency viruses (SIV) cause the destruction of CD4+ lymphocytes in their respective hosts, resulting in the development of acquired immunodeficiency syndrome (AIDS) (1, 2) .
  • AIDS acquired immunodeficiency syndrome
  • The entry of HIV into host cells is mediated by the viral envelope glycoproteins, which are organized into oligomeric, probably trimeric, spikes displayed sparsely on the surface of the virion. These envelope complexes are anchored in the viral membrane by the gp41 transmembrane envelope glycoprotein.
  • The surface of the spike is composed primarily of the exterior envelope glycoprotein, gp120, associated by noncovalent interactions with each subunit of the trimeric gp41 glycoprotein complex (3, 4).
  • V1-V5 variable regions
  • The first four variable regions form surface-exposed loops that contain disulfide bonds at their bases (6).
  • The conserved gp120 regions form discontinuous structures important for the interaction with the gp41 ectodomain and with the viral receptors on the target cell. Both conserved and variable gp120 regions are extensively glycosylated (6).
  • The variability and glycosylation of the gp120 surface likely modulate the immunogenicity and antigenicity of the gp120 glycoprotein, which is the major target for neutralizing antibodies elicited during natural infection (7).
  • The gp120 envelope glycoprotein binds to the CD4 glycoprotein, which serves as the primary receptor.
  • The gp120 glycoprotein binds to the most amino-terminal of the four immunoglobulin-like domains of CD4. Structures of both the N-terminal two domains (8, 9) and the entire extracellular portion of CD4 (10) have been determined, and mutagenesis studies indicate that the CD4 structure analogous to the second complementarity-determining region (CDR2) of immunoglobulins is critical for gp120 binding (11, 12). Conserved gp120 residues important for CD4 binding have likewise been identified by mutagenesis (3, 13, 14).
  • CD4 binding induces conformational changes in the gp120 glycoprotein, some of which involve the exposure and/or formation of a binding site for specific chemokine receptors.
  • Chemokine receptors, mainly CCR5 and CXCR4 for HIV, serve as obligate second receptors for virus entry (15, 16).
  • The gp120 third variable (V3) loop is the major determinant of chemokine receptor specificity (17).
  • Other, more conserved gp120 structures that are exposed upon engagement of CD4 also appear to be involved in chemokine-receptor binding.
  • CD4-induced exposure is indicated by the enhanced binding of several gp120 antibodies (18, 19) which, like V3-loop antibodies, efficiently block the binding of gp120-CD4 complexes to the chemokine receptor (20). These are called the CD4-induced (CD4i) antibodies.
  • CD4 binding may trigger additional conformational changes in the envelope glycoproteins. For example, the binding of CD4 to the envelope glycoproteins of some HIV-1 isolates induces the release or "shedding" of the gp120 protein from the complex (21), although the relevance of this process to HIV entry is uncertain.
  • HIV and related retroviruses belong to a class of enveloped fusogenic viruses that includes corona-, paramyxo- and orthomyxoviruses (e.g. influenza virus), all of which require post-translational cleavage for activation.
  • the transmembrane coat proteins of these viruses (gp41 equivalents) share sequence resemblance, particularly in their N-terminal fusion peptides, and they participate directly in membrane fusion.
  • The ectodomain of gp41 can form a coiled coil resembling that of influenza hemagglutinin HA (23, 4, 22), supporting the notion that this class of viruses may share some common aspects with respect to virus entry. In other respects, enveloped viruses tend to be distinctive.
  • CD4i epitope A companion report relates this structure to the antigenic properties of the gpl20 envelope proteins (24) .
  • The crystallized gp120 is from the HXBc2 strain of HIV-1. It has deletions of 52 and 19 residues from the N- and C-termini, respectively; Gly-Ala-Gly tripeptide substitutions for 67 V1/V2-loop residues and 32 V3-loop residues; and the removal of all sugar groups beyond the linkages between the two core N-acetylglucosamine residues.
  • This deglycosylated core gp120 eliminates over 90% of the carbohydrate but retains over 80% of the non-variable-loop protein. Its capacity to interact with CD4 and relevant antibodies is preserved at or near wild-type levels (26).
  • The final model, composed of 7877 atoms, comprises residues 90-396 and 410-492 of gp120 (excepting loop substitutions), residues 1-181 of CD4, and residues 1-213 of the light chain and 1-229 of the heavy chain of the 17b monoclonal antibody.
  • 11 N-acetylglucosamine and 4 fucose residues, and 602 water molecules have been placed.
  • The overall structure of the complex of gp120 with D1D2 of CD4 and Fab 17b is as depicted in Fig. 8.
  • The deglycosylated core of gp120 as dissected from the ternary complex approximates a prolate ellipsoid with dimensions of 50 x 50 x 25 Å, although its overall profile is more heart-shaped than circular.
  • Its backbone structure is shown in Figs. 9a & c in an orientation precisely perpendicular to that in Fig. 8 (Fig. He gives a mutually perpendicular view) .
  • This core gp120 comprises 25 β strands, 5 α helices and 10 defined loop segments, all organized with the topology shown in Fig. 9b. Specific spans of structural elements are given in Fig. 9d.
  • The structure confirms the chemically determined disulfide bridge assignments (6; Fig. 9c).
  • The polypeptide chain of gp120 is folded into two major domains plus certain excursions that emanate from this body.
  • The inner domain (inner with respect to the N- and C-termini) features a two-helix, two-strand bundle with a small five-stranded β sandwich at its termini-proximal end and a projection at the distal end from which the V1/V2 stem emanates.
  • The outer domain is a stacked double barrel that lies alongside the inner domain such that the outer barrel and inner bundle axes are approximately parallel.
  • The proximal barrel of the outer-domain stack is composed from a 6-stranded, mixed-directional β sheet that is twisted to embrace helix α2 as a 7th barrel stave.
  • The distal barrel of the stack is a 7-stranded antiparallel β barrel.
  • The two barrels share one contiguous hydrophobic core, and the staves also continue from one barrel to the next except at the domain interface.
  • This interruption is centered at a side between barrels where the chain enters the outer domain with loop LB insinuated as a tongue between strands β16 and β23.
  • The extended segment just preceding LB is like an 8th stave of the distal barrel, but it is slightly out of reach for hydrogen bonding with its β16 and β9 neighbors.
  • The chain returns to complete the inner domain after β24.
  • The proximal end of the outer domain includes variable loops V4 and V5 and loops LD and LE, which are variable in sequence as well.
  • Loop LC is also at this end, close in space to loop LA of the inner domain, although by topology it is at the other end of this domain.
  • The distal end does include the stem of the excised variable loop V3 and also an excursion via loop LF into a β hairpin, β20-β21, which in turn hydrogen bonds with the V1/V2 stem emanating from the inner domain.
  • This bridging sheet also participates in the separated interactions of gp120 with both CD4 and the 17b antibody (Fig. 8 and below).
  • One further excursion from the body of the outer domain produces strand β15 and helix α3, which are also important in CD4 binding.
  • This structure of core gp120 should be a prototype for the class.
  • FIG. 9d shows that an HIV-2 sequence is 35% identical with that of the HXBc2 strain expressed in this crystallized construct, and the identity level rises to 77% and 51%, respectively, for the more closely related HIV-1 clade C and clade O representatives.
  • the inner domain is appreciably more conserved than the outer domain with 86%, 72% and 45% identity for the respective C, O and HIV-2 comparisons. Variability correlates with the degree of solvent exposure of residues (Fig. 9d) , in keeping with the conservation of hydrophobic cores.
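The identity figures quoted above are pairwise percent identities over aligned sequences. As a minimal sketch of how such a figure is obtained, the function below counts matching residues over the aligned, non-gap positions of two pre-aligned sequences. The choice of denominator (positions where neither sequence has a gap) is an assumption; the source does not say which convention was used, and other conventions give slightly different values. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip alignment gaps (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned (non-gap) positions to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy strings for illustration only, not real gp120 sequence.
    print(round(percent_identity("ACDEFG", "ACDQFG"), 1))
```

Restricting the calculation to residues of one domain, as in the inner-domain versus outer-domain comparison, only requires slicing both aligned strings to the same column range first.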
  • The seven disulfide bridges retained in core gp120 are absolutely conserved and mostly buried (Fig. 9c).
  • Glycosylation sites are all surface exposed and are conserved above average (Fig. 9d).
  • The previously identified HIV variable segments are all on loops connecting elements of secondary structure, and loops LD and LE are also especially variable. Indeed, LE is more variable than V5 in light of current sequence data. These loops are also relatively mobile as reflected in high B factors or disorder, as in V4.
  • Variable segments in the outer domain, including the exposed face of α2, appear to arise from neutral mutation rather than selective pressure since they are on non-immunogenic surfaces, presumably masked by glycosylation.
  • CD4 is bound into a depression formed at the interface of the outer domain with the inner domain and the bridging sheet of gp120 (Fig. 10a).
  • This interaction buries a total of 742 Å² from CD4 and 802 Å² from gp120.
  • The surface areas that are actually in contact are considerably smaller (Fig. 10d) because an unusual mismatch in surface topography creates large cavities that are occluded in the interface, as described below. There is, however, a general complementarity in electrostatic potential at the surfaces of contact, although the match is imprecise in this respect as well.
  • The focus of CD4 positivity is displaced from the center of greatest negativity on gp120 (Fig. 10c).
  • The binding site is devoid of carbohydrate (Fig. 10g).
  • The structure of CD4 in this complex differs only locally from that in free D1D2 structures and at only a few places: residues 17-20 at the poorly ordered CDR1-like loop and residues 41, 42, 47, 49 and 60, which are at or near the contact site and have low B factors in the gp120-bound state.
  • Direct interatomic contacts are made between 22 CD4 residues and 26 gp120 amino-acid residues. These include 219 van der Waals contacts and 12 hydrogen bonds. Residues in contact are concentrated in the span from 25 to 64 of CD4, but they are distributed over six segments of gp120 (Figs. 9d & 10i): 1 residue from the V1/V2 stem, loop LD, the β15-α3 excursion, the β20-β21 hairpin, strand β23 and the β24-α5 connection. These interactions are compatible with previous analyses of mutational data on both CD4 (11, 12, 29) and gp120 (3, 13, 14).
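Contact tallies of this kind are usually derived by testing every atom pair across the interface against a distance cutoff. The sketch below illustrates that general approach only; the 4.0 Å cutoff, the function name and the toy coordinates are assumptions for illustration, since the source does not state the criteria used to define a contact, and hydrogen bonds would additionally need donor/acceptor and geometry checks that are not shown.

```python
import math


def count_contacts(atoms_a, atoms_b, cutoff=4.0):
    """Count atom pairs within `cutoff` Å and collect the residues involved.

    Each atom is a (residue_label, (x, y, z)) tuple. The cutoff is an
    assumed value, not one taken from the source.
    """
    n_pairs = 0
    residues_a, residues_b = set(), set()
    for res_a, pos_a in atoms_a:
        for res_b, pos_b in atoms_b:
            if math.dist(pos_a, pos_b) <= cutoff:
                n_pairs += 1
                residues_a.add(res_a)
                residues_b.add(res_b)
    return n_pairs, residues_a, residues_b


if __name__ == "__main__":
    # Made-up coordinates; residue labels are borrowed from the text only as names.
    cd4 = [("Phe43", (0.0, 0.0, 0.0)), ("Arg59", (10.0, 0.0, 0.0))]
    gp120 = [
        ("Asp368", (3.0, 0.0, 0.0)),
        ("Glu370", (0.0, 3.5, 0.0)),
        ("Trp427", (20.0, 0.0, 0.0)),
    ]
    print(count_contacts(cd4, gp120))
```

The all-pairs loop is quadratic in the number of atoms, which is adequate for a single interface; a spatial grid or k-d tree would be the usual refinement for larger systems.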
  • Lys 29 makes a direct ionic hydrogen bond, and while Asp 457 of gp120 is near to these electropositive groups (Figs. 10e & i) it does not make hydrogen bonds.
  • gp120 residues that are covered by CD4 are variable in sequence. This variation is accommodated in part by the large interfacial cavity (Fig. 10e).
  • The gp120 residues in contact with this water-filled cavity are especially variable (Fig. 10g).
  • Half of the gp120 residues that make contacts with CD4 do so only through main-chain atoms (including Cβ) of gp120, and 60% of CD4 contacts are made by gp120 main-chain atoms (Fig. 10f). Included among these are 5 of the 12 hydrogen bonds in the interface.
  • One such contributing element is an antiparallel β-sheet alignment of CD4 strand C" with gp120 strand β15 (Figs. 10a & i).
  • Atomic details of the interaction are particularly intricate and unusual for the contacts made between gp120 and the mutationally critical CD4 residues Phe 43 and Arg 59 (Fig. 10j).
  • Arg 59 interacts with Asp 368 and Val 430.
  • The carboxylate group of Asp 368 makes double hydrogen bonds with the guanidinium Nη atoms of Arg 59, but it also hydrogen bonds back to the backbone NH group of residue 44 and it appears to be optimally positioned to receive a CH...O hydrogen bond (3.20 Å) from the Phe 43 ring.
  • Phe 43 interacts with residues Glu 370, Ile 371, Asn 425, Met 426, Trp 427 and Gly 473 as well as Asp 368, but only the contacts with Ile 371 have a conventional hydrophobic character. Those to 425-427 and 473, including Trp 427, are only to backbone atoms. A surprisingly large fraction of the Phe 43 contacts (28%) are to polar groups. The phenyl group is stacked on the carboxylate group of Glu 370, and there are contacts with the carbonyl oxygen atoms of residues 425, 426 and 473 and the NH group of Trp 427.
  • The larger cavity is lined by mostly hydrophilic residues, half derived from gp120 and half from CD4. It is not deeply buried; while formally a cavity in the crystal structure, minor changes in side-chain orientation would make it solvent accessible.
  • The observed electron density and predicted hydrogen bonding are consistent with at least 8 water molecules in the cavity.
  • Residues from gp120 that actually line the cavity (including Ala 281, Ser 364, Ser 365, Thr 455, Arg 469) exhibit sequence variability, whereas surrounding this variable patch are conserved residues, the substitution of which affects CD4 binding. These include the critical contact residues Asp 368, Glu 370 and Trp 427, which flank one end of the cavity, and Asp 457 at the other end (Fig. 10e).
  • CD4 residues that line the cavity can be mutated with only moderate effect on gp120 binding, whereas Arg 59 suffers less loss of solvent accessible surface upon gp120 binding but is highly sensitive to mutation.
  • This cavity thus serves as a water buffer between gp120 and CD4 (Fig. 10e).
  • The tolerance for variation in the gp120 surface associated with this cavity produces a variational island (Fig. 10g), or "anti-hot spot", which is centrally located between regions required for CD4 binding, and may help the virus escape from antibodies directed against the CD4 binding site.
  • The "Phe 43" cavity (Fig. 10b & h) is very different in character from the larger binding-interface cavity. It is roughly spherical, with a diameter of ~8 Å (atom center to atom center) across the center of the cavity. It is positioned just beyond Phe 43 of CD4, at the intersection of the inner domain, the outer domain and the bridging sheet. It is relatively deeply buried, extending into the hydrophobic interior of gp120. The phenyl ring of Phe 43 is the only non-gp120 residue contacting this cavity, forming a lid which covers the bottom of the cavity (Fig. 10b).
  • Residues that line the Phe 43 cavity are primarily hydrophobic. They are also highly conserved, as much so as the buried gp120 hydrophobic core. Despite a lack of steric hindrance, almost no substitutions to larger residues are found. Given the frequency of gp120 sequence divergence, such conservation strongly implies functional significance. Indeed, although residues that line this cavity provide little direct contact to CD4, they do nevertheless affect the gp120-CD4 interaction. Thus, mutations at Thr 257 (no contacts) and Trp 427
  • The 17b antibody is a broadly neutralizing human monoclonal isolated from the blood of an HIV-infected individual. It binds to a CD4-induced (CD4i) gp120 epitope that overlaps the chemokine receptor-binding site (20).
  • The interface between Fab 17b and core gp120 in the ternary complex involves a small area of interaction.
  • The solvent accessible area excluded upon binding is only 455 Å² from gp120 and 445 Å² from 17b, which is largely from the heavy chain (371 Å²).
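To put the "small area of interaction" in context, the buried-surface figures quoted in this description for the two interfaces can be compared directly. The short calculation below is a sketch that simply sums the per-partner values given in the text (742 and 802 Å² for CD4-gp120; 445 and 455 Å² for 17b-gp120) and derives the heavy-chain share of the 17b contribution. The dictionary layout and function names are illustrative choices, not part of the source.

```python
# Buried solvent-accessible surface (Å^2) per partner, as quoted in the text.
BURIED = {
    "CD4-gp120": {"CD4": 742.0, "gp120": 802.0},
    "17b-gp120": {"17b": 445.0, "gp120": 455.0},
}
HEAVY_CHAIN_17B = 371.0  # portion of the 17b burial from the heavy chain (Å^2)


def total_buried(interface):
    """Sum of the surface buried on both partners of an interface (Å^2)."""
    return sum(BURIED[interface].values())


def heavy_chain_share():
    """Percentage of the 17b buried surface contributed by the heavy chain."""
    return 100.0 * HEAVY_CHAIN_17B / BURIED["17b-gp120"]["17b"]


if __name__ == "__main__":
    for name in BURIED:
        print(f"{name}: {total_buried(name):.0f} Å^2 buried in total")
    ratio = total_buried("17b-gp120") / total_buried("CD4-gp120")
    print(f"17b interface is {100.0 * ratio:.0f}% the size of the CD4 interface")
    print(f"Heavy chain share of 17b burial: {heavy_chain_share():.0f}%")
```

On these figures the antibody interface buries a little under 60% of the surface buried by CD4, with over four fifths of the antibody contribution coming from the heavy chain.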
  • The long (15 residue) complementarity-determining region 3 (CDR3) of the heavy chain dominates, but the heavy-chain CDR2 and the light-chain CDR3 also contribute.
  • the 17b contact surface is very acidic (3 Asp, 3 Glu, no Arg or Lys) although hydrophobic contacts (notably a cis proline and tryptophan from the light chain) predominate at the center.
  • the 17b epitope lies across the base of the four-stranded bridging sheet (Fig. He & e) . All four strands make substantial contact with 17b, suggesting that the integrity of the bridging sheet is necessary for 17b binding.
  • The gp120 surface that contacts 17b consists of a hydrophobic center surrounded by a highly basic periphery (3 Lys, 1 Arg, and no Asp or Glu) (Fig. 11d). Although this basic gp120 surface complements the acidic 17b surface, only one salt bridge is observed (between Arg 419 of gp120 and Glu 106 of the 17b heavy chain). The rest of the specific contacts occur between hydrophobic and polar residues.
  • The interaction between 17b and gp120 involves a hydrophobic central region flanked on the periphery by charged regions, predominantly acidic on 17b and basic on gp120.
  • There are no direct CD4-17b contacts, and none of the gp120 residues contacts both 17b and CD4. Rather, CD4 binds on the opposite face of the bridging sheet, providing specific contacts that appear to stabilize its conformation (Fig. 10i and 10j) and may explain in part the CD4-induction of 17b binding.
  • The 17b epitope is well conserved among HIV-1 isolates. Of the 18 residues that show loss in solvent accessible surface upon contact with 17b, 12 residues (67%) are conserved among all HIV-1 viruses. By contrast, only 19 of the 37 gp120 residues (51%) that show loss of solvent accessible surface upon CD4 binding are similarly conserved.
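The two conservation percentages follow directly from the residue counts given in the text. The snippet below is a simple arithmetic check, not an analysis of sequence data; the function name is a hypothetical helper.

```python
def percent_conserved(conserved, total):
    """Percentage of interface residues that are conserved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * conserved / total


if __name__ == "__main__":
    # Counts as stated in the text for the 17b epitope and the CD4 binding surface.
    print(f"17b epitope: {percent_conserved(12, 18):.0f}%")
    print(f"CD4 binding surface: {percent_conserved(19, 37):.0f}%")
```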
  • CD4i epitopes tend to be masked from immune surveillance by the adjacent V2 and V3 loops (see accompanying paper) . Indeed, in the complex structure, a large gap is seen between gpl20 and tips of the light-chain CDR1 and CDR2 loops. Pointing directly at this gap is the base of the V3 loop.
  • variable loops may need to be bypassed for access to the conserved structures in the bridging sheet .
  • the 17b epitope may be further protected from the immune system by a CD4- induced conformational change (see below) .
  • the site of interaction with the chemokine receptor CCR5 overlaps with the 17b epitope (30). Both are induced upon CD4 binding and both involve highly conserved residues.
  • the basic and polar gp120 residues (Lys 121, Arg 419, Lys 421, Gln 422) that contact the 17b heavy chain also are important for CCR5 interaction (30).
  • the hydrophobic and acidic surface of the 17b heavy chain may mimic the tyrosine-rich, acidic N-terminal region of CCR5, which is important for gp120 binding and HIV-1 entry (31, 32).
  • this site is directed at the cellular membrane when gp120 is engaged by CD4. Electrostatic interactions between the basic surface of the bridging sheet and the acidic chemokine receptor (and possibly the acid headgroups in the target membrane) could drive conformational changes related to virus entry.
  • Although monomeric in isolation, gp120 likely exists as a trimeric complex with gp41 on the virion surface.
  • the large electroneutral surface on the inner domain (Fig. 10c) is the probable site of trimer packing based on its lack of glycosylation, its conservation in sequence, the location of CD4 and CCR5 binding sites, and the immune response to this region.
  • the Phe 43 cavity (now a pocket) would present a perplexing structural dilemma.
  • the cavity-lining residues have few structural restrictions, with ample room for larger substitutions into the cavity, yet these residues are highly conserved and inexplicably hydrophobic if exposed in a pocket.
  • This pocket structure is in turn intimately connected to the bridging sheet, itself peculiar in the absence of CD4.
  • the backbone amide of bridging-sheet residue 425 is hydrogen-bonded to Glu 370, a critical CD4 contact residue (Fig.
  • Trp 427 packs perpendicular to Trp 112, which lines the pocket from the inner domain (Fig. 10b) .
  • Nε of Trp 427 is delicately poised for hydrogen-bonding with the π-electrons of the indole ring of Trp 112. Structures such as these would necessarily be very sensitive to orientational shifts between the inner and outer domains.
  • core gp120 may differ in the absence of CD4 comes from comparison with theory.
  • the evolutionary algorithm of PHD (37) gives secondary-structure predictions with 90% estimated reliability for roughly 45% of the core gp120 sequence. Compared to our structure, it is accurate except at three places where it is markedly wrong (four consecutive residues with reliability index greater than 90%). All of these are at the Phe 43 cavity or in contacts with CD4: loop LB, strand β15, and the segment of β20 into the turn to β21.
  • Fig. 10c stabilizes a nascent complex state, and inserts the Phe 43 to induce formation of the Phe 43 cavity.
  • the HIV surface proteins function to fuse the viral membrane with the target cell membrane.
  • the gpl20 glycoprotein plays roles crucial to the control and initiation of fusion.
  • One set of roles concerns positioning: locating a cell capable of productive viral infection, anchoring the virus to the cell surface, and orienting the viral spike next to the target membrane.
  • Another set concerns timing: holding the gp41 in a metastable conformation and triggering the coordinate release of the three N-terminal fusion peptides of the trimeric gp41. While it is clear that this is a complex multi-conformational process, the simplicity of the system, composed only of two membranes, the viral oligomer, and two host receptors, raises the possibility that we may be able to understand the entire mechanism.
  • Crystallography has now provided two snapshots: an intermediate state in which gp120 is bound to CD4, described herein; and a probably final, "fusion-active" state of the gp41 ectodomain (40,41).
  • the entry process is initiated by the binding of HIV-1 to the cellular receptor CD4 (Fig. 12, step 1). Although the extracellular portion of CD4 has some segmental flexibility, this binding roughly orients the viral spike.
  • This orientation can be simulated by an alignment of the D1D2 CD4 in the ternary complex with the previously solved structure of the four-domain, entire extracellular portion of CD4 (10). Such alignment orients the N- and C-termini of core gp120 towards the viral membrane, while the 17b epitope/chemokine receptor-binding site on the gp120 surface faces the target cell membrane.
  • Such an orientation is consistent with the proposed oligomeric structure and gp41-interactive surfaces described above.
  • CD4 binding also induces conformational changes in gp120, which result in the creation of a metastable oligomer. Although some of the more flexible gp120 regions and gp41 are missing, the structure of the core gp120-CD4 complex presented here describes this state in atomic detail. CD4 binding results in movement of the V2 loop, which numerous experiments suggest partially occludes the V3 loop and CD4i epitopes (18, 36). It also creates, or at least stabilizes, the bridging sheet on which these epitopes are located (described above for the core).
  • CD4 binding results in changes in the conformation of the V3 region, with the tip of the loop becoming more accessible, as judged by enhanced proteolytic susceptibility and altered exposure of V3 epitopes (19).
  • the V3 loop together with the uncovered epitopes comprise the chemokine-receptor binding site.
  • CD4 binding not only orients the gp120 surface implicated in chemokine receptor binding to face the target cell, but it also forms and exposes the site itself.
  • these changes may all result from a single, concerted shift in the relative orientation of the inner and outer domains.
  • This conformational shift may alter the orientation of the N- and C-termini, at the proximal end of the inner domain, perhaps partially destabilizing the oligomeric gp120/gp41 interface (21).
  • Such a shift would also alter the relative placement of the V1/V2 stem (in the CD4i site) , which emanates from the inner domain, and the V3 loop, which emanates from the outer domain.
  • mutations that permit an adaptation of HIV-1 to CD4-independent entry using CXCR4 involve sequence changes in both the V1/V2 stem and the V3 loop (42).
  • the next step in HIV-1 entry is the interaction of the gp120-CD4 complex with the chemokine receptor (Fig. 12, step 2).
  • interactions between CD4 and chemokine receptor may occur, mutagenic analyses (H. Choe and J. Sodroski, unpublished observations) and the known examples of CD4-independent virus entry or chemokine-receptor binding suggest that direct gp120 contacts dominate in the interaction with the chemokine receptor. Since most of the chemokine receptor is encased in the host membrane, binding would necessarily move the gp120 bridging sheet close to the target membrane. This movement requires CD4 flexibility since the initial HIV binding at the N-terminal D1 domains probably occurs above the glycocalyx.
  • the structure of the gp120/CD4/17b antibody ternary complex described here reveals some of the molecular aspects of HIV-1 entry, including the atomic structure of gp120, the explicit interactions with CD4, and the conserved site of binding for the chemokine receptor. Still unknown are details of the apo state of core gp120, the oligomeric structure, the interaction with the chemokine receptor, the conformational changes that trigger the reorganization of the gp41 ectodomain and the structural basis for insertion of the fusion peptide of gp41 into the target membrane. Further understanding will require snapshots of other intermediates.
  • the conformational complexity and observed intricate domain associations of gp120 may reflect genome restrictions at the protein level akin to those that lead to overlapping reading frames at the transcription level. Multiply protected infection machinery is contained in these condensed intricacies. Its mechanisms frustrate host defenses; understanding them may inspire medical intervention.
  • the two-domain CD4 (D1D2, residues 1-182) was produced in Chinese hamster ovary cells (8), the monoclonal antibody 17b in an Epstein-Barr virus immortalized B-cell clone isolated from an HIV-1 infected individual and fused with a murine B-cell fusion partner (18), and the core gp120 from Drosophila Schneider 2 lines under control of an inducible metallothionein promoter (20).
  • the various biochemical manipulations e.g. deglycosylation for the gp120 and papain digestion to produce the Fab 17b
  • protein purification e.g. ternary complex crystallization are described elsewhere (25).
  • the best crystals were small needles of cross-section only 30-40 μm. These were crosslinked with vapor diffusion glutaraldehyde treatment
  • cryoprotectant containing stabilizer (10% ethylene glycol with 10.5% monomethyl-PEG 5,000, 10% isopropanol, 50 mM NaCl, 100 mM Citrate/HEPES buffer pH 6.3), transferred into immiscible oil (Paratone-N; Exxon), suspended in a small ethylene loop at the end of a mounting pin, and flash-frozen in a cryostat nitrogen stream at 100 K.
  • Diffraction data were collected at beamline X4A, Brookhaven National Laboratory, using phosphor image plates and a Fuji BAS2000 scanner. To avoid overlap problems from the relatively high mosaicity (~1.0°), oscillation data were collected using a rotation axis that was off-set at least 30° from the 197 Å c axis. Although crystals initially diffracted to Bragg spacing of greater than 2 Å, ⁇ axis mosaicity and substantial radiation damage despite cryogenic cooling reduced the overall resolution to 2.5 Å. Data processing and reduction were performed using DENZO and SCALEPACK (45) (Table 1).
  • each of the top 100 possible rotational solutions, with each of three different CD4 models (1cdi, 1cdh, 3cd4), was searched for a distinctive translation solution (AMoRe; J. Navaza).
  • the translation searches used the rigid body refined Fab as a partial structure to help discriminate the correct solution.
  • Two distinctive solutions were found: the 25th rotational solution of 3cd4 gave a translation correlation of 0.171 (versus 0.128 for the second highest translation solution), and the 61st rotational solution of 1cdh gave 0.149 (versus 0.140). These two solutions were virtually identical.
  • Rigid body refinement in XPLOR (46) gave a Patterson correlation of 7.9% for the CD4 alone and 32.4% for the Fab and CD4. All molecular replacement and rigid body refinements used 8-4 Å data.
  • crystals were soaked in over 20 different heavy atom solutions and screened for isomorphous replacement using the statistical χ² test in SCALEPACK (45).
  • Derivatives were identified from two heavy atom compounds: 10 mM K3IrCl6 (10 hr equilibration in heavy atom containing cryoprotectant stabilizer; 2.8 Å) and 5 mM K2OsCl6 (24 hr soak; 3.5 Å).
  • Isomorphism was found to be highest between these heavy atom data sets and a native data set collected at pH 7.0 (cryoprotectant stabilizer buffered with 50 mM BisTris pH 7.0) .
  • K3IrCl6 derivative was modeled as 9 partially occupied sites; two sites of occupancy 0.158 and 0.142, and 7 of less than 0.07. While relatively isomorphous, poor data quality (Rsym of greater than 20% past 3.0 Å) combined with relatively small isomorphous differences (Riso of 12.0%) reduced the quality of phasing. In contrast, the K2OsCl6 derivative had an Riso of 15.6%, but was only isomorphous to roughly 5 Å. It was modeled as 4 sites of occupancy 0.321, 0.207, 0.194 and 0.128, with the highest site at the same position as the second highest site from K3IrCl6.
  • Deviations of the CD4 structure in the complex from the free state were measured by the procedure of Wu et al. (10). Deviations were taken as significant when the root mean square (rms) residue deviation was greater than the overall value and also more than 0.5u greater than variation among the free structures.
  • Interatomic contacts were defined as in Zhu et al. (48). Structural alignments were made by visual comparison of the SCOP database, and automatic searches were performed with PrISM (A.-S. Yang and B. Honig).
  • HIV-1 entry co-factor functional cDNA cloning of a seven-transmembrane, G protein-coupled receptor. Science 272, 872-877 (1996) .
  • chemokine receptors as human immunodeficiency virus type 1 coreceptors determined by individual amino acids in the envelope V3 loop. J. Virol. 71, 7136-7139 (1997) .
  • HIV-1 gpl20 glycoproteins with the chemokine receptor CCR-5 Nature 384, 179-183 (1996).
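
The buried-surface and conservation figures quoted in the list above for the 17b and CD4 contact surfaces can be re-derived with simple arithmetic. The Python sketch below is only an illustrative check: the function and variable names are assumptions introduced here, and the input numbers are the ones stated in the text.

```python
# Illustrative arithmetic check; all names are hypothetical, inputs come from the text.

def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Solvent accessible area excluded on 17b binding, in square angstroms.
buried_gp120 = 455        # contributed by gp120
buried_17b = 445          # contributed by Fab 17b
buried_17b_heavy = 371    # part of the 17b area that comes from the heavy chain

total_buried = buried_gp120 + buried_17b
heavy_chain_share = percent(buried_17b_heavy, buried_17b)

# Contact residues conserved among HIV-1 isolates: (conserved, total).
conserved_17b = percent(12, 18)   # 17b epitope
conserved_cd4 = percent(19, 37)   # CD4 contact surface

print(f"total buried area: {total_buried} square angstroms")
print(f"heavy-chain share of 17b area: {heavy_chain_share:.0f}%")
print(f"17b epitope conserved: {conserved_17b:.0f}%")
print(f"CD4 contact surface conserved: {conserved_cd4:.0f}%")
```

The rounded percentages reproduce the 67% and 51% values given for the 17b epitope and the CD4 contact surface.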

Abstract

This invention provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts those interactions.

Description

COMPOUNDS INHIBITING CD4-gp120 INTERACTION AND USES THEREOF
This application is a continuation-in-part application of U.S. Serial No. 09/100,764, filed June 18, 1998, which is a continuation-in-part of U.S. Serial No. 08/967,708, filed November 10, 1997. The contents of these two applications are hereby incorporated by reference into this application.
The invention disclosed herein was made with United States Government support under National Institutes of Health Grant Nos. AI 31783, AI 39420, AI 28691, CA 06516, AI 41851, and AI 40895. Accordingly, the United States Government has certain rights in this invention.
Various references are referred to within this application. Disclosures of these publications in their entireties are hereby incorporated by reference into this application to more fully describe the state of the art to which this invention pertains.
Background of the Invention
During the first thirty years of protein crystallization, the standard conceptual practice was to treat the protein as a fixed constant and screen it through a multitude of crystallization conditions. Advances in this approach have led to the development of crystallization robots capable of testing thousands of conditions (1,2). While this approach has had success, it fails for many interesting proteins.
One of these is the Human Immunodeficiency Virus (HIV)-1 envelope glycoprotein, gp120. HIV induces acquired immunodeficiency syndrome (AIDS) in humans (3,4). The gp120 glycoprotein helps to mediate virus entry into cells through sequential recognition of two cellular receptors of the human host, CD4 (5,6), and a chemokine receptor (primarily CXCR-4 or CCR-5, depending on viral strain) (7-12). These high affinity interactions are attractive targets for mimetic drug design. Although the structure of the gp120-binding domain of CD4 and the identity of residues critical to its interaction with gp120 have been known for several years (13,14), this has not been sufficient for design of potent antagonists (15-17). Because gp120 is the major virus-specific antigen accessible to neutralizing antibodies, knowledge of its structure could also have a considerable impact on vaccine design.
The gp120 protein has been an obvious target for structural investigation, and quantities of pure soluble protein have been available for several years, a byproduct in part from vaccine trials. Nevertheless, despite considerable effort, it has resisted crystallographic analysis for more than a decade.
The mature gp120 glycoproteins of different HIV-1 strains have approximately 470-490 amino acids (18). Extensive N-linked glycosylation at approximately 20-25 sites accounts for roughly half its mass (18,19). Sequences from many different viral isolates show that it contains five conserved regions (C1-C5), five variable regions (V1-V5) (18, 20) and nine conserved disulfide bridges (19). Except for limited N- and C-terminal cleavage, proteolytic digestion does not reveal a sub-domain structure. Indeed, even after extensive proteolytic cleavage, the unreduced protein runs near its native molecular weight on SDS-PAGE (Peter D. Kwong: unpublished data). Some of the variable regions, the V3 loop in particular, appear to be conformationally variable. Conformational change is also evidenced by shedding, the CD4-induced dissociation of gp120 from the surface of the virus, and by ligand-induced variations in monoclonal antibody binding (21,22). These changes may be related to the functional role of gp120 in virus entry.
The extensive glycosylation and conformational heterogeneity of gp120 suggested that merely screening the protein through ever more exotic crystallization conditions would not produce well-diffracting crystals. We therefore adopted a fundamentally different approach, which we term variational crystallization. This approach relied on radical modification of the protein surface, primarily to reduce heterogeneity, but also as a means of varying potential crystallization lattice contacts. An interactive cycle, involving different biochemical and molecular biological techniques, was used to detect and remove chemical and conformational heterogeneity. In addition, protein ligands, such as CD4 and the Fabs of monoclonal antibodies, were used to restrict conformational mobility. Progressive trials of 18 different gp120 crystallization variants yielded six different crystals. This paradigm of crystallization, with a focus on protein modification rather than on crystallization screening, may aid in the structural analysis of other conformationally complex proteins.
Summary of the Invention
This invention provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine or alanine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine, or alanine) 371 and CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein Het is phenyl, Bn, EtPh, or heteroarylalkyl;
c. a group X that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein X is hydroxyalkyl, hydroxyaryl, alkylamide, or arylamide;
d. an aromatic group or heteroaromatic group, Het, that binds to the side chain indole group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an alkyl group, R, that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43, wherein R is alkyl, cycloalkyl, or haloalkyl;
h. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
i. a group X that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of gp120 asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
j. a group Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
k. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
l. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59, wherein Z is alkoxyalkyl, aryloxyalkyl, alkoxyaryl, haloalkyl, haloaryl, alkylamide, arylamide, alkylcarboxylate, arylcarboxylate, arylalkyl ester, dialkyl ester, or alkylaryl ester.
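The selection rule stated for the list above, that a compound must disrupt two or more of the enumerated interactions, amounts to a simple count over the lettered items. The Python sketch below illustrates that rule only; the `Interaction` record, the `qualifies` function, and the example label sets are hypothetical names introduced here, and just a few of the listed interactions are encoded.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interaction:
    """One lettered item from the list (hypothetical record layout)."""
    label: str          # list letter, e.g. "a"
    gp120_residue: str  # gp120 side of the contact
    cd4_residue: str    # CD4 side of the contact
    kind: str           # interaction type named in the text

# A partial encoding of the list above, for illustration only.
INTERACTIONS = {
    "a": Interaction("a", "Ile (Val or Ala) 371", "Phe 43", "hydrophobic"),
    "b": Interaction("b", "Asp 368", "Phe 43", "dipolar"),
    "d": Interaction("d", "Trp 427", "Phe 43", "hydrophobic"),
    "j": Interaction("j", "Asp 368", "Arg 59", "ionic"),
    "k": Interaction("k", "Val (or Ala) 430", "Arg 59", "hydrophobic"),
}

def qualifies(disrupted_labels, minimum=2):
    """True if at least `minimum` known listed interactions are disrupted."""
    known = {label for label in disrupted_labels if label in INTERACTIONS}
    return len(known) >= minimum

# Hypothetical compounds described only by the items they would disrupt.
print(qualifies({"a", "j"}))   # two listed interactions
print(qualifies({"d"}))        # only one listed interaction
```

Labels that do not correspond to a listed item are ignored, so an unrecognized label never counts toward the threshold.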
This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c. an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gp120 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32, wherein Q is dialkylketone, alkylarylketone, or arylalkylketone;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropylalcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 with the side chain ethyl alcohol group of CD4 serine 60.
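Several items in the list above specify a binding distance in angstroms (3.6, 3.4, 3.3, 3.1 and 3.8 ang.). A minimal sketch of how such a distance criterion might be checked against atomic coordinates follows. The coordinates, the 0.2 ang. tolerance, and the function names are assumptions made for illustration and are not taken from the crystal structure.

```python
import math

def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) positions, in angstroms."""
    return math.dist(atom_a, atom_b)  # available in Python 3.8 and later

def at_listed_distance(atom_a, atom_b, target, tolerance=0.2):
    """True if the separation is within `tolerance` of the listed distance.

    The tolerance is an arbitrary illustrative choice.
    """
    return abs(distance(atom_a, atom_b) - target) <= tolerance

# Hypothetical coordinates for a compound atom and a gp120 backbone atom.
compound_atom = (0.0, 0.0, 0.0)
gp120_atom = (2.0, 2.0, 2.2)

separation = distance(compound_atom, gp120_atom)
print(f"separation: {separation:.2f} angstroms")
print(at_listed_distance(compound_atom, gp120_atom, target=3.6))
print(at_listed_distance(compound_atom, gp120_atom, target=3.1))
```

With these made-up coordinates the separation is close to the 3.6 ang. value listed for some items and outside the tolerance for the 3.1 ang. value.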
This invention also provides a Method of inhibiting the interaction of HIV-gpl20 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gpl20 in a manner that disrupts two or more of the following interactions:
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c. an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gp120 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropyl alcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 with the side chain ethyl alcohol group of CD4 serine 50;
j1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan (or phenylalanine) 112 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
k1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain phenyl group of gp120 phenylalanine 382 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
l1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 384 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
m1. an alkyl group, R, that binds to the side chain alkyl group of gp120 valine (isoleucine, or glutamine) 255 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
n1. a group, X, that binds to the side chain hydroxyl group of gp120 threonine 257 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
o1. a group, Y, that binds to the side chain carboxyl group of gp120 glutamic acid 370 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
p1. an alkyl group, R, that binds to the side chain isobutyl group of gp120 isoleucine 424 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
q1. an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan 427 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43; and/or
r1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 435 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43.
Brief Description of the Figures
Figures for the First Series of Experiments
Figure 1
Computer-generated ribbon drawing of the tertiary structure of CD4, gp120, and Fab 17b interacting. CD4 is in the top left, gp120 is toward the right, and Fab 17b is in the bottom left of the figure.
Figure 2
Sketch providing a simplified illustration of the locations of CD4, gp120, and Fab 17b in the computer-generated ribbon drawing of Figure 1.
Figure 3
Photomicrographs of crystals containing HIV-1 gp120. Crystal types A-F are shown and correspond to the crystal types described in the text and Tables 3 and 4. The photomicrograph in A is at twice the magnification. The bar in A corresponds to 25 μm (50 μm for B-F).
Figure 4
Polyacrylamide gel electrophoresis (PAGE) of the ternary complex crystals (Type E). A cluster of crystals (0.4 x 0.1 x 0.05 mm) was washed four times with 1 μl of reservoir solution, dissolved in 3 μl of loading buffer, and analyzed by SDS-PAGE on an 8-25% gradient gel (Pharmacia Phast system). Lane 1, 2.5 μg of ternary complex purified by gel filtration. The top band is the deglycosylated Δ82ΔV1/2*ΔV3ΔC5 gp120, the next two bands are the alkylated and reduced heavy and light chains, respectively, of the Fab 17b, and the bottom band is the two-domain sCD4 (D1D2). Lane 2, standards: 94, 67, 43 (diffuse), 30, 20, and 14. Lane 3, supernatant from the crystallization droplet. Lane 4, last wash of crystals. Lane 5, dissolved crystals. The gel is silver stained.
Figure 5
Computer-generated ribbon drawing of the tertiary structure of CD4 and gp120 interacting. CD4 is toward the bottom and gp120 is toward the top.
Figures for the Second Series of Experiments
Figures 6A and 6B
The HIV-1 entry process. The trimeric HIV-1 envelope glycoproteins, anchored in the viral membrane, are depicted, with gp120 in the lower right and gp41 in the upper right. For simplicity, the gp120 variable loops are not shown, but would extend over the outer surface of the envelope glycoprotein complex. The receptors on the target cell, CD4 and chemokine receptor, are also shown. The structures of gp120, gp41, and CD4 are adapted from available X-ray crystallographic studies (5,20,21), whereas the chemokine receptor model is hypothetical.
Figure 7
The HIV-1 gp120 surface
Figure 7A
The molecular surface of the HIV-1 gp120 core (20) is shown, with the arrow pointing towards the viral membrane. The inner domain, believed to interact with gp41, and the outer domain, which is probably exposed on the assembled trimer, are on the left and right, respectively. The gp120 surface occluded by CD4 is shown and the gp120 region thought to be involved in chemokine receptor binding (27) is also shown. The location of the base of the V3 loop is shown.
Figure 7B
Conserved gp120 neutralization epitopes are shown on the gp120 core, which is oriented identically to that in Figure 7A. The location of the epitopes was deduced from mutagenic analysis (45,46,48).
Figure 7C
The approximate location of gp120 structures (20) that contribute to protection from antibody responses is shown. The major variable loops (V2, V3, and V4), the V5 region and the sites of N-linked glycosylation are shown.
Figure 7D
The relationship of different surfaces of the gp120 core to the antibody response generated by the gp120 glycoprotein is depicted. The surface of gp120 that interacts with neutralizing antibodies (32) is shown; it spans the inner and outer domains and includes the V2 and V3 variable loops (not shown). The surface of gp120 that interacts with non-neutralizing antibodies is located on the inner domain and includes the gp41-interactive N- and C-terminal gp120 regions (not shown). The heavily glycosylated surface of the gp120 outer domain, which appears to be minimally immunogenic, is also shown.
Figures for the Third Series of Experiments
Figure 8
Overall structure. The ribbon diagram shows gp120, the N-terminal two domains of CD4, and the Fab 17b (light chain) and (heavy chain). The sidechain of Phe 43 on CD4 is also shown. The prominent CDR3 loop of the 17b heavy chain is evident in this orientation. Although the complete N- and C-termini of gp120 are missing, the positions of the gp120 termini are consistent with the proposal that gp41, and hence the viral membrane, is located towards the top of the diagram. This would position the target membrane at the diagram base. The vertical dimension of gp120 in this orientation is roughly 50 Å. Precisely perpendicular views of gp120 are shown in Figures 9 and 11. Drawn with RIBBONS (49).
Figure 9
Structure of core gp120. The orientation of gp120 in each of the panels shown in this figure is related to Figure 8 by a 90° rotation about a vertical axis. Thus the viral membrane would be oriented above, the target membrane below, and the C-terminal tail of CD4 coming out of the page. In this view, we describe the left portion of core gp120 as the "inner" domain, the right portion as the "outer" domain, and the 4-stranded sheet at the bottom left of gp120 as the "bridging sheet." The bridging sheet (β3, β2, β21, β20) can be seen packing primarily over the inner domain, although some surface residues of the outer domain, e.g. Phe 382, reach in to form part of its hydrophobic core.
Figure 9A
Ribbon diagram. Helices and β-strands are depicted. Strand β15 makes an antiparallel β-sheet alignment with strand C' of CD4. The dashed line to the right of the diagram represents the disordered V4 loop. Selected parts of the structure are labeled.
Figure 9B
Secondary structure diagram. The schematic is arranged to coincide with the orientation of Figures 9A and 9C.
Helices are shown as corkscrews and labeled α1-α5. β-strands are shown as arrows: black and labeled arrows represent the 25 β-strands of core gp120; gray and unlabeled arrows represent the continuation of hydrogen bonding across a sheet; the white and labeled arrow represents the C' strand of CD4. Spatial proximity between neighboring strands implies mainchain hydrogen bonding. Loops are labeled LA-LF and V1-V5. The labels of loops with high sequence variability are circled. Assignment of secondary structure was accomplished with the Kabsch and Sander algorithm except for β4 and β8, which are both interrupted mid-strand by sidechain-backbone hydrogen bonds; β9, β15, and β25a, all of which have angles or hydrogen bonds which are slightly non-standard; and α4, which hydrogen bonds as a 3-10 helix with the final residue in β-conformation.
Figure 9C
Stereo plot of an α-carbon trace. Every 10th Cα is marked with a filled circle, and every twentieth residue is labeled. Disulfide connections are depicted as ball and stick. Shown are ordered residues, 90-396 and 410-492.
Figure 9D
Structure-based sequence alignment. Shown are the sequences of "HIV-1 B" (core gp120 from clade B, strain HXBc2 used in these studies), "C" (HIV-1 clade C, strain UG268A2), "O" (HIV-1 clade O, strain ANT70), "HIV-2" (strain ROD), and "SIV" (African green monkey isolate, clone GRI-1). The secondary structure assignments are shown as arrows and cylinders, with (x) denoting residues which are disordered in the present structure. The "gars" sequence at the N-terminus and the "gag" sequence in the V1/V2 and V3 loops are consequences of the gp120 truncation. Solvent accessibility is indicated for each residue by an open circle if the fractional solvent accessibility is greater than 0.4, a half-closed circle if 0.1 to 0.4, and a closed circle if less than 0.1. Sequence variability observed among primate immunodeficiency viruses is indicated below the solvent accessibility by the number of horizontal hash marks: 1 mark, residues conserved among all primate immunodeficiency viruses; 2 marks, conserved among all HIV-1 isolates; 3 marks, exhibits moderate variation among HIV-1 isolates; and 4 marks, exhibits significant variability among HIV-1 isolates. In assessing conservation, all single atom changes were permitted as well as larger substitutions if the character of the sidechain was conserved (e.g. K to R or F to L). N-linked glycosylation is indicated by "m" for the high mannose additions and "c" for the complex additions observed in mammalian cells (6). Residues of gp120 in direct contact with CD4 are indicated by "*". Direct contact is a more restrictive criterion of interaction than the often used loss of solvent accessible surface; residues of gp120 which show loss of solvent accessible surface but are not in direct contact are 123, 124, 126, 257, 278, 282, 364, 471, 475, 476 and 477. Parts (a) and (b) were drawn with MOLSCRIPT (P. J. Kraulis).
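For illustration only, the symbol conventions of Figure 9D can be written as a small lookup. The Python sketch below is not part of the disclosed methods; the function and dictionary names are hypothetical, and assigning values of exactly 0.1 and 0.4 to the half-closed class is an assumption, since the caption states only the ranges.

```python
# Hypothetical helper illustrating the Figure 9D annotation scheme.
# Boundary handling (exactly 0.1 or 0.4 -> "half-closed") is an assumption.

def accessibility_symbol(fraction: float) -> str:
    """Return the circle symbol for a fractional solvent accessibility."""
    if fraction < 0.0:
        raise ValueError("fractional solvent accessibility cannot be negative")
    if fraction > 0.4:
        return "open"          # greater than 0.4
    if fraction >= 0.1:
        return "half-closed"   # 0.1 to 0.4
    return "closed"            # less than 0.1


# Number of horizontal hash marks -> variability class, as stated in the caption.
VARIABILITY_BY_MARKS = {
    1: "conserved among all primate immunodeficiency viruses",
    2: "conserved among all HIV-1 isolates",
    3: "moderate variation among HIV-1 isolates",
    4: "significant variability among HIV-1 isolates",
}


if __name__ == "__main__":
    for value in (0.55, 0.25, 0.03):
        print(value, accessibility_symbol(value))
```

The thresholds are kept in one function so that a different boundary convention could be substituted without touching the lookup of hash marks.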
Figure 10
CD4-gp120 interactions.
Figure 10A
Ribbon diagram of gp120 binding to CD4. Residue Phe 43 of CD4 is also depicted reaching into the heart of gp120. From this orientation the recessed nature of the gp120 binding pocket is evident.
Figure 10B
Electron density in the Phe 43 cavity. The 2Fo-Fc electron density map at 2.5 Å, 1.1σ contour, is shown. The orientation is the same as in (a). The foreground has been clipped for clarity, removing the overlying β24-α5 connection. In the upper middle of the picture is the central unidentified density. At the bottom of the picture, Phe 43 of CD4 can be seen reaching up to contact the cavity. Moving clockwise around the cavity, the gp120 residues are Trp 427 (with its indole ring partially clipped by foreground slabbing), Trp 112, Val 255, Thr 257, Glu 370 (packing under the Phe 43 ring), Ile 371, and Glu 368 (partially clipped in the bottom right corner). Hydrophobic residues lining the back of the cavity can be partially glimpsed around the central unidentified density.
Figure 10C
Electrostatic surface of CD4 and gp120. The electrostatic potential is displayed at the solvent accessible surface, which is shaded according to the local electrostatic potential. The slight "puffiness" of the surface arises from the enlarged nature of the solvent accessible surface relative to the standard molecular surface. On the right, the gp120 surface is shown in an orientation similar to that of Figures 9A and 9C, but rotated ~20° around a vertical axis to depict the recessed binding pocket more clearly. A thin yellow Cα worm of CD4 is shown to aid in orientation. On the left, the CD4 surface is shown, rotated relative to the gp120 panel by an exact 180° rotation about the vertical axis shown. A thin red Cα worm of gp120 is shown.
Figure 10D
CD4-gp120 contact surface. On the right, the gp120 surface is shown with the surface within 3.5 Å of CD4 (surface-to-atom center distance). This effectively creates an "imprint" of CD4 on the displayed gp120 surface. On the left (180° rotation), the corresponding CD4 surface and gp120 "imprint" is also shown.
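The "imprint" described for Figure 10D amounts to a distance cutoff between points on one surface and atom centers of the binding partner. The following Python sketch is illustrative only: it assumes that surface points and atom centers are available as (x, y, z) coordinates in angstroms, the function name and sample coordinates are hypothetical, and no particular surface-calculation program is implied.

```python
import math

# Cutoff stated in the caption (surface-to-atom center distance, in angstroms).
CONTACT_CUTOFF = 3.5


def contact_imprint(surface_points, partner_atom_centers, cutoff=CONTACT_CUTOFF):
    """Flag each surface point lying within `cutoff` of any partner atom center.

    Both arguments are sequences of (x, y, z) tuples. Returns a list of
    booleans, one per surface point.
    """
    return [
        any(math.dist(point, atom) <= cutoff for atom in partner_atom_centers)
        for point in surface_points
    ]


if __name__ == "__main__":
    # Made-up coordinates, purely to show the call pattern.
    surface = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    partner = [(3.0, 0.0, 0.0)]
    print(contact_imprint(surface, partner))
```

The brute-force comparison of every point with every atom is chosen for clarity; for full-size structures a spatial index would normally be substituted.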
Figure 10E
CD4-gp120 mutational "hot-spots." On the right, the surface of gp120 is shown with the surface of gp120 residues shown by substitution to affect CD4 binding highlighted: substantial effect -- residues 257, 368, 370 and 427; moderate effect -- residue 457. Also depicted is the surface of the large water-filled cavity at the CD4-gp120 interface. On the left (180° rotation), residues important for gp120 binding are shown on the CD4 surface: substantial effect -- residues 43 and 59; moderate effect -- residues 29, 35, 44, 46, 47.
Figure 10F
Sidechain/mainchain contribution to the gp120 surface. The orientation is the same as the right panel of Figures 10C-10E, and below (Figure 10G), and allows for direct comparison of the CD4-gp120 contact surface. A striking surface concentration of mainchain atoms is seen in the regions corresponding to the CD4 "imprint."
Figure 10G
Sequence variability mapped to the gp120 surface. The sequence variability observed among primate immunodeficiency viruses (Figure 9D) is depicted mapped onto the gp120 surface. Also shown is the carbohydrate: N-acetylglucosamine and fucose residues present in the structure; Asn-proximal N-acetylglucosamines modeled at residues 88, 230, 241, 356, 397, 406, 462. Much of the carbohydrate (22 residues) is hidden on the back side of the outer domain.
Figure 10H
Phe 43 cavity. The surface of the Phe 43 cavity is shown, buried in the heart of gp120. A worm representation of gp120 shows the three stretches that are incorrectly predicted by secondary structure prediction: the LB loop, bending around the top of the cavity; strands β20-β21, just below the cavity; and strand β15, slightly more distal, to the right of the cavity. The orientation shown here is the same as for the gp120 surfaces in Figures 10C-10G.
Figure 10I
Schematic of the CD4-gp120 interface. This schematic of the entire interface shows six discrete segments of gp120 (solid black line) interacting with CD4 (double line). To aid in orientation, secondary structural elements are labeled, as are representative contact residues from each segment of gp120. Arrows indicate mainchain direction. The sidechain of Phe 43 is also shown. The orientation shown is similar to Figures 10A and 10B.
Figure 10J
Schematic of gp120 contacts around Phe 43 and Arg 59 of CD4. Residues on gp120 involved in direct contact with Phe 43 or Arg 59 are depicted. Electrostatic interactions are depicted as dashed lines. Hydrophobic interactions are found between Phe 43 (CD4) and Trp 427, Glu 370, Gly 473, and Ile 371 (all from gp120) and between Arg 59 (CD4) and Val 430 (gp120). The orientation is similar to Figures 10A, 10B, and 10I, but has been rotated for clarity. Sidechains of Phe 43 and Arg 59 as well as those portions of gp120 sidechains which interact with these crucial CD4 residues are drawn with bold lines. (Figure 10A was drawn with RIBBONS (49), Figure 10B with the program O (47), and Figures 10B-10G with GRASP (50).)
Figure 11
Neutralizing antibody 17b-gp120 interface.
Figure 11A
Worm diagram of Fab 17b and gp120. The Fab 17b is shown binding to gp120. The orientation shown is the same as in Figures 9A and 9C.
Figure 11B
Contact surface and V3 loop. The surface of gp120 is shown with any surface within 3.5 Å of Fab 17b (surface-to-atom center) and the surface of the V3 base. The orientation is the same as in Figure 11A.
Figure 11C
Contact surface and V3 loop. The same as Figure 11B, but rotated around a horizontal axis to more clearly depict the 17b epitope.
Figure 11D
Electrostatic surface. The electrostatic potential is displayed at the solvent accessible surface, which is shaded according to the local electrostatic potential. The electrostatic shading is the same scale as that shown in Figure 10C. The surface that corresponds to the 17b epitope is the most electropositive region of the molecule. The V3 loop is truncated here, but sequence analysis shows that it is generally quite positively charged.
Figure 11E
Worm diagram of gp120. The gp120 is shown shaded according to the same scheme given in Figure 11A. The orientation is the same as in Figures 11C and 11D, that is, 90° from Figure 11A.
Figure 12
Schematic representation of the gp120 initiation of fusion. A single monomer of core gp120 is depicted in an orientation similar to Figures 9A and 9C. The "3" symbolizes the 3-fold axis, from which gp41 interacts with the gp120 N- and C-termini to generate the functional oligomer. In the initial state of gp120 (on the surface of a virion), the V1/V2 loops are shown partially occluding the CD4 binding site. Following CD4 binding (now at a target cell, though above the glycocalyx), a conformational change is depicted as an inner/outer domain shift, with the dark circle denoting the formation of the Phe 43 cavity. This conformational change strains the interactions at the N- and C-termini of gp120 with the rest of the oligomer, priming the CD4-bound gp120 core. In the next step (which takes place directly adjacent to the target membrane), the chemokine receptor binds to the bridging sheet and the V3 loop (at the bottom left and right, respectively, of gp120), causing an orientational shift of core gp120 relative to the oligomer. This triggers further steps, which ultimately lead to the fusion of the viral and target membranes.
Figure 13
Structure of HIV-1 gp120 with neutralizing antibody and human receptor CD4.
Figures for the Fourth Series of Experiments
Figure 14A
Structure and orientation of the HIV-1 gp120 core. Cα tracing of the gp120 core, which was crystallized in a ternary complex with two-domain sCD4 and Fab fragment of the 17b antibody (12), is shown. The gp120 core is seen from the perspective of CD4, and is oriented with the viral membrane at the top of the figure and the target cell membrane at the bottom. The N- and C-termini of the truncated gp120 core are labeled, as are the positions of structures related to the gp120 variable regions, V1-V5. The LD and LE surface loops (12) are shown. The position of the "Phe 43" cavity involved in CD4 binding is indicated by an asterisk. A gp120 surface implicated in binding to the CCR5 chemokine receptor (C. Rizzuto and J. Sodroski, submitted) is indicated. The perspectives in Figures 14B, C and D are indicated.
Figure 14B
View of the molecular surface of the gp120 outer domain, from the perspective indicated in Figure 14A. The molecular surface in the figure on the left is shaded according to the variability observed in gp120 residues among primate immunodeficiency viruses. The variability of the gp120 surface shown is underestimated since the V4 variable loop, which is not resolved in the structure, contributes to this surface. The position of the V5 region is shown. Also note the highly conserved glycosylation site (asparagine 356 and threonine/serine 358) within the LE loop, between the V5 and V4 regions. In the figure on the right, the V4 loop and the carbohydrates are modeled, as described in Materials and Methods.
Figure 14C
View of the gp120 molecular surface facing the target cell. Variability is indicated in the figure on the left, using the shading scheme as in Figure 14B. Note the clear demarcation between the conserved surface, which has been implicated in the formation of CD4i epitopes (18) and in chemokine receptor binding (C. Rizzuto and J. Sodroski, unpublished observations), and the variable surface of the outer domain. The recessed binding site for CD4 is indicated, flanked by the V1/V2 stem, which is labeled. The V4 loop and the carbohydrates are modeled in the figure on the right. The figure is shaded as indicated in Figure 14B; particular carbohydrates referred to elsewhere in this report are labeled.
Figure 14D
View of the molecular surface of the gp120 core inner domain. In the figure on the left, variability is indicated by the shading scheme used in Figure 14B. The CD4-binding site is to the right of the figure, and the protruding V1/V2 stem is indicated. The conserved molecular surface, which is associated with the inner domain of the gp120 core, is devoid of known N-linked glycosylation. These are modeled in the figure on the right, which is shaded as described in Figure 14B.
Figure 15
The spatial relationship of epitopes on the HIV-1 gp120 glycoprotein.
Figure 15A
The molecular surface of the gp120 core is shown, from the same perspective as that in Figure 14A. The modeled N-terminal gp120 core residues, V4 loop and carbohydrate structures are included. The variability of the molecular surface is indicated, using the shading scheme described in Figure 14B. The approximate locations of the V2 and V3 variable loops are indicated. Note the well-conserved surfaces near the "Phe 43" cavity and the chemokine receptor-binding site (see Figure 14A).
Figure 15B
A Cα tracing of the gp120 core, oriented similarly to Figure 14A. The gp120 residues within Figure 17A of the 17b CD4i antibody are shown. The residues implicated in the binding of CD4BS antibodies (20) are shown. Changes in these residues significantly affect the binding of at least 25 percent of the CD4BS antibodies listed in the table from the fourth series of experiments. The residues implicated in 2G12 binding (19) are shown. The V4 variable loop, which contributes to the 2G12 epitope (19), is indicated by dotted lines (see Figure 14A).
Figure 15C
The molecular surface of the gp120 core, oriented and shaded as in Figure 15B, is shown.
Figure 15D
Approximate locations of the faces of the gp120 core, defined by the interaction of gp120 and antibodies. The molecular surface accessible to neutralizing ligands (CD4 and CD4BS, CD4i and 2G12 antibodies) is shown in white. The neutralizing face of the complete gp120 glycoprotein includes the V2 and V3 loops, which reside adjacent to the surface shown (see Figure 15A). The approximate location of the gp120 face that is poorly accessible on the assembled envelope glycoprotein trimer and therefore elicits only non-neutralizing antibodies (5,6) is shown. The approximate location of an immunologically "silent" face of gp120, which roughly corresponds to the highly glycosylated outer domain surface, is also shown.
Figure 16
A likely arrangement of the HIV-1 gp120 glycoproteins in a trimeric complex. The gp120 core was organized into a trimeric array, based on the criteria discussed in the text. The perspective is from the target cell membrane, similar to that shown in Figure 14C. The CD4 binding pockets are indicated by black arrows, and the chemokine receptor-binding regions are darkly shaded. The lightly shaded areas indicate the more variable, glycosylated surface of the gp120 core. The approximate locations of the 2G12 epitopes are indicated by open arrows. The approximate locations for the V3 loops and V4 regions are shown. The positions of the V5 regions and some complex carbohydrate addition sites (asparagines 276, 463, 356, 397 and 406) are shown. The approximate locations of the large V1/V2 loops, centered on the known positions of the V1/V2 stems, are indicated. On one of the gp120 subunits, the positions of the LD and LE loops are indicated. The distance of each of the gp120 monomers from the 3-fold symmetry axis is arbitrary.
Figures for the Fourth Series of Experiments
Figure 17
The HIV gp120 derivative used in the binding assay. The wild-type gp120 and gp41 envelope glycoproteins are shown in the upper figure. Conserved (black) and variable (white) regions (25) are indicated. The wtΔ protein, which is derived from the primary macrophage-tropic YU2 HIV-1 isolate (7), is shown beneath the wild-type envelope glycoproteins. The N-terminal and V1/V2 deletions correspond to those previously described for the HXBc2 gp120 mutants Δ82 and Δ128-194, respectively (8,9). SIG=signal peptide.
Figure 18 The gp120-CCR5 binding assay.
Figure 18A
The radiolabeled wtΔ protein was incubated either with the parental L1.2 cells or with the L1.2-CCR5 cells. Incubations were carried out either in the absence or presence of sCD4 (100 nM). The wtΔ protein bound to the cells is shown. The two bands represent different glycoforms of gp120.
Figure 18B
The wtΔ protein was incubated with both sCD4 and 17b antibody at the indicated concentrations prior to addition to the L1.2-CCR5 cells. The L1.2-CCR5 cells were incubated with 2D7 anti-CCR5 antibody or MIP-1β at the indicated concentrations prior to incubation with wtΔ-sCD4 complexes. The wtΔ protein bound to the cells is shown.
Figure 18C
The amount of radiolabeled wtΔ or selected mutant envelope glycoproteins precipitated by a mixture of HIV-1-infected patient sera (Total), precipitated by sCD4 and an anti-CD4 antibody (Bound (sCD4)), or bound to L1.2-CCR5 cells (Bound (CCR5)) is shown.
Figure 19 Structure of the HIV-1 gp120 region implicated in CCR5 binding.
Figure 19A
A ribbon drawing of the HIV-1 gp120 glycoprotein (6) complexed with CD4 is shown. The perspective is that from the target cell membrane. The two amino-terminal domains of CD4 are shown. The gp120 inner domain is shown, the outer domain is shown and the "bridging sheet" is shown. The gp120 residues in which changes resulted in a >90% decrease in CCR5 binding are labeled. The V1/V2 stem and base of the V3 loop (strands β12 and β13 and the associated turn) are indicated.
Figure 19B
A molecular surface of the gp120 glycoprotein from the same perspective as that of Figure 19A is shown. Shaded surfaces are associated with gp120 residues in which changes resulted in either a ≥ 75% decrease, a > 90% decrease or a ≥ 50% increase in CCR5 binding, when CD4 binding was at least 50% of that seen for the wtΔ protein.
Figure 19C
The surface depicted in Figure 19B is shaded according to the degree of conservation observed among primate immunodeficiency viruses (25) .
Figure 19D
The molecular surface of the gp120 glycoprotein is shown, indicating residues in which changes resulted in a > 70% decrease in 17b antibody binding, in the absence of sCD4.
Figure 19E
The molecular surface of the gp120 glycoprotein is shown, indicating residues in which changes resulted in a > 70% decrease in CG10 antibody binding in the presence of sCD4. Residues in which changes significantly decreased CD4 binding (and thus indirectly decreased CG10 binding) are not shown. Images were made with Midas-Plus (Computer Graphics Lab, University of California, San Francisco) and GRASP (26).
Figure 20
Shows the atomic coordinate data, obtained by x-ray crystallography, of the ternary complex of HIV-1 gp120 with CD4 and Fab 17b, having space group P2221 and unit cell dimensions a=71.643, b=88.130, c=196.7. The raw data and the coordinates were described in U.S. Serial No. 09/100,764, filed June 18, 1998 and U.S. Serial No. 08/967,708, filed November 10, 1997, from which the subject application claims priority. These priority documents are available for public inspection. The contents of these applications are incorporated into this application by reference. The coordinates have been deposited in the Brookhaven Protein Data Bank with the accession code 1gc1. In addition, the coordinates may be obtained on the worldwide web at www.pbd.bnl.gov by inputting "1gc1".
Figure 21
Provides a detailed list of all the contacts between gp120 and CD4.
Detailed Description of the Invention
The invention relates to crystals of gp120 suitable for x-ray diffraction. The three-dimensional structure of gp120 provides information which has a number of uses, principally related to the development of pharmaceutical compositions which mimic the action of gp120.
The essence of the invention resides in the obtaining of crystals of gp120 of sufficient quality to determine the three-dimensional (tertiary) structure of the protein by x-ray diffraction methods.
This invention provides crystals of sufficient quality to obtain a determination of the three-dimensional structure of gp120 to high resolution, preferably to the resolution of 2.5 angstroms.
The value of crystals of gp120 extends beyond merely being able to obtain a structure for gp120. The knowledge of the structure of gp120 provides a means of investigating the mechanism of action of these proteins in the body. For example, binding of these proteins to various receptor molecules can be predicted by various computer models. Upon discovering that such binding in fact takes place, knowledge of the protein structure then allows chemists to design and attempt to synthesize molecules which mimic the binding of gp120 to its receptors. This is the method of "rational" drug design.
One skilled in the art may use one of several methods to screen chemical entities for their ability to associate with gp120. This process may begin by visual inspection of, for example, the active site on the computer screen based on the gp120 coordinates. Docking may be accomplished using software such as Quanta and Sybyl, followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM and AMBER.
Specialized computer programs may also assist in the process of selecting fragments or chemical entities. These include:
GRID [P.J. Goodford, "A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules", J. Med. Chem. 28:849-857 (1985)]. GRID is available from Oxford University, Oxford, UK.
MCSS [A. Miranker and M. Karplus, "Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method", Proteins: Structure, Function and Genetics, 11:29-34 (1991)]. MCSS is available from Molecular Systems, Burlington, MA.
AUTODOCK [D.S. Goodsell and A.J. Olson, "Automated Docking of Substrates to Proteins by Simulated Annealing", Proteins: Structure, Function, and Genetics, 195-202 (1990)]. AUTODOCK is available from Scripps Research Institute, La Jolla, CA.
Once suitable entities or fragments have been selected, they can be assembled into a single compound or inhibitor. Assembly may proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of gp120. This would be followed by manual model building using software such as Quanta or Sybyl.
Useful programs to aid one of skill in the art in connecting the individual chemical entities or fragments include:
CAVEAT [P.A. Bartlett et al., "CAVEAT: A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules", in "Molecular Recognition in Chemical and Biological Problems", Special Pub., Royal Chem. Soc. 78, pp. 182-196 (1989)]. CAVEAT is available from the University of California, Berkeley, CA.
3D Database systems such as MACCS-3D (MDL Information Systems, San Leandro, CA). This area is reviewed in Y. C. Martin, "3D Database Searching in Drug Design", J. Med. Chem., 35:2145-2154 (1992).
Instead of proceeding to build a gp120 inhibitor in a step-wise fashion one fragment or chemical entity at a time as described above, inhibitory or other types of binding compounds may be designed as a whole or "de novo" using either an empty active site or optionally including some portion(s) of a known inhibitor(s). These methods include:
LUDI [H.-J. Bohm, "The Computer Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors", J. Comp. Aid. Molec. Design, 6:61-78 (1992)]. LUDI is available from Biosym Technologies, San Diego, CA.
LEGEND [Y. Nishibata and A. Itai, Tetrahedron, 47:8985 (1991)]. LEGEND is available from Molecular Simulations, Burlington, MA.
Other molecular modeling techniques may also be employed in accordance with this invention. See, e.g., N.C. Cohen et al., "Molecular Modeling Software and Methods for Medicinal Chemistry", J. Med. Chem., 33:883-894 (1990). See also, M.A. Navia and M.A. Murcko, "The Use of Structural Information in Drug Design", Current Opinions in Structural Biology, 2:202-210 (1992). For example, where the structures of test compounds are known, a model of the test compound may be superimposed over the model of the structure of the invention. Numerous methods and techniques are known in the art for performing this step, any of which may be used. See, e.g., P.S. Farmer, Drug Design, Ariens, E.J., ed., Vol. 10, pp. 119-143 (Academic Press, New York 1980); U.S. Patent No. 5,331,573; U.S. Patent No. 5,500,807; C. Verlinde, Structure, 2:577-587 (1994); and I.D. Kuntz, Science 257:1078-1082 (1992). The model building techniques and computer evaluation systems described herein are not a limitation on the present invention.
Thus, using these computer evaluation systems, a large number of compounds may be quickly and easily examined and expensive and lengthy biochemical testing avoided. Moreover, the need for actual synthesis of many compounds is effectively eliminated.
Once identified by the modeling techniques, the gp120 or CD4 antagonist may be tested for bioactivity using standard techniques. For example, the structure of the invention may be used in binding assays using conventional formats to screen inhibitors. Suitable assays for use herein include, but are not limited to, the enzyme-linked immunosorbent assay (ELISA), or a fluorescence quench assay. Other assay formats may be used; these assay formats are not a limitation on the present invention.
In another aspect, the gp120 structure of the invention permits the design and identification of synthetic compounds and/or other molecules which have a shape complementary to the conformation of the gp120 active site of the invention. Using known computer systems, the coordinates of the gp120 structure of the invention may be provided in machine readable form, the test compounds designed and/or screened and their conformations superimposed on the structure of the invention. Subsequently, suitable candidates identified as above may be screened for the desired gp120 inhibitory bioactivity, stability, and the like.
Once identified and screened for biological activity, these inhibitors may be used therapeutically or prophylactically to block gp120 activity.
Accordingly, this invention also provides material which is the basis for the rational design of drugs which mimic the action of gp120.
The subject invention provides a crystal suitable for X-ray diffraction comprising a polypeptide having an amino acid sequence of a portion of a Human Immunodeficiency Virus envelope glycoprotein gp120.
The subject invention also provides the above-described crystals, which effectively diffract X-rays for determination of the atomic coordinates of the polypeptide to a resolution of 4 angstroms or better than 4 angstroms.
The subject invention also provides the above-described crystals, which effectively diffract X-rays for determination of the atomic coordinates of the polypeptide to a resolution of 2.5 angstroms or better than 2.5 angstroms.
The subject invention also provides the above-described crystals, wherein the portion of gp120 comprises a CD4 binding site.
The subject invention further provides the above-described crystals, further comprising a compound bound to the CD4 site.
The subject invention also provides the above-described crystals, wherein the portion of gp120 comprises a chemokine receptor binding site.
The subject invention also provides the above-described crystals, further comprising a compound bound to the chemokine receptor binding site.
The subject invention also provides the above-described crystals, wherein the portion of gp120 comprises a CD4 binding site and a chemokine receptor binding site.
The subject invention also provides the above-described crystals, further comprising a first compound bound to the CD4 binding site of the polypeptide and a second compound bound to the chemokine receptor binding site of the polypeptide.
The subject invention also provides the above-described crystals, wherein the first compound is the second compound.
The subject invention also provides the above-described crystals, wherein the crystal is arranged in a space group P2221, so as to form a unit cell of dimensions a=71.6 Å, b=88.1 Å, c=196.7 Å, and which effectively diffracts x-rays for determination of the atomic coordinates of the gp120 to a resolution of 2.5 Å or better.
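As a quick consistency check on the cell described above, the unit cell volume can be computed from the cell edges. The sketch below is illustrative only: the function name `unit_cell_volume` is hypothetical, and the 90 degree cell angles are an assumption that follows from the orthorhombic space group rather than from values stated in the text.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Return the volume (cubic angstroms) of a unit cell with edges a, b, c
    (angstroms) and angles alpha, beta, gamma (degrees).

    This is the general triclinic formula; for an orthorhombic cell
    (all angles 90 degrees) it reduces to a * b * c.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

# Cell edges quoted in the text; the 90 degree angles are assumed (orthorhombic).
volume = unit_cell_volume(71.6, 88.1, 196.7)
print(f"{volume:.3e} cubic angstroms")
```

For the quoted edges this gives a volume of roughly 1.24 million cubic angstroms.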
The subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gp120 lacking the V1, V2, V3, and C5 regions.
The subject invention also provides the above-described crystals, wherein the gp120 variant comprises a portion of the conserved stem of the V1/V2 stem-loop structure.
The subject invention also provides the above-described crystals, wherein the gp120 variant comprises a portion of the base of the V3 loop.
The subject invention also provides the above-described crystals, wherein the gp120 variant comprises a portion of the C5 region.
The subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gp120 with 5% by weight of the carbohydrate residues linked to the gp120 in substantially the same manner as they are linked to gp120 in unmodified gp120.
The subject invention also provides the above-described crystals, wherein the polypeptide is a variant of gp120 with 15% by weight of the carbohydrate residues linked to the gp120 polypeptide in substantially the same manner as they are linked to gp120 in unmodified gp120.
The subject invention also provides the above-described crystals, further comprising a Fab, a CD4, a polypeptide having an amino acid sequence of a portion of CD4, or a combination thereof, bound to the gp120.
The subject invention also provides the above-described crystals, wherein the Fab is produced from an antibody to a discontinuous epitope.
The subject invention also provides the above-described crystals, wherein the monoclonal antibody is designated 17b.
The subject invention additionally provides a method for producing a crystal suitable for X-ray diffraction comprising: (a) deglycosylating a polypeptide having an amino acid sequence of a portion of a gp120 wherein said portion is produced by deleting or replacing part of the gp120 to reduce the surface loop flexibility; (b) contacting the polypeptide with a ligand so as to form a complex which exhibits restricted conformational mobility; and (c) obtaining a crystal from the complex so formed to produce a crystal suitable for X-ray diffraction.
The subject invention also provides the above-described methods, wherein the V1, V2, or V3 loop of the gp120 contained in the polypeptide is partially truncated, deleted or replaced.
The subject invention also provides the above-described methods, wherein the polypeptide lacks the V1, V2, V3 and C5 loops of the gp120.
The subject invention also provides the above-described methods, wherein the polypeptide also lacks up to fifty N-terminal amino acids of the gp120 or up to fifty C-terminal amino acids of gp120.
The subject invention also provides the above-described methods, wherein the ligand is a Fab, a CD4, or a polypeptide having an amino acid sequence of a portion of CD4.
The subject invention also provides the above-described methods, wherein the resulting polypeptide after the deglycosylation contains at least 5% of the carbohydrate.
The subject invention also provides the crystal produced by the above-described methods.
The subject invention also provides a method for identifying a compound capable of binding to a portion of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining a binding site on the portion of gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising the portion of gp120; and (b) determining whether a compound would fit into the binding site, a positive fitting indicating that the compound is capable of binding to the gp120.
The subject invention also provides a method for designing a compound capable of binding to a portion of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining a binding site on the portion of gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising the portion of gp120; and (b) designing a compound to fit the binding site.
Structure-based drug design has been known and was previously described. See e.g., Bugg et al. (1993) Sci. Amer., December: 92-98; Giranda (1994) Structure, 2:695-698; Lam et al. (1994) Science 263:380-384; and Navia et al. (1994) Circulation 89(4):1557-1566.
The subject invention also provides the above-described methods, wherein the fitting is determined by shape complementarity or by estimated interaction energy.
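One simple way to estimate an interaction energy, offered here only as a sketch, is a pairwise 12-6 Lennard-Jones sum over ligand and binding-site atoms. The functional form is standard, but the parameter values (`epsilon`, `r_min`, `cutoff`), the function names, and the example coordinates are all illustrative assumptions; the text does not specify which energy function is used, and force fields such as CHARMM and AMBER include further terms (for example electrostatics) that are omitted here.

```python
import math

def lennard_jones(r, epsilon=0.1, r_min=3.8):
    """12-6 Lennard-Jones energy at separation r (angstroms).

    epsilon is the well depth and r_min the separation at the energy minimum;
    both defaults are illustrative placeholders, not fitted parameters.
    """
    ratio = r_min / r
    return epsilon * (ratio**12 - 2.0 * ratio**6)

def interaction_energy(ligand_atoms, site_atoms, cutoff=8.0):
    """Sum pairwise Lennard-Jones energies between two sets of (x, y, z) points,
    ignoring pairs farther apart than the cutoff."""
    total = 0.0
    for p in ligand_atoms:
        for q in site_atoms:
            r = math.dist(p, q)
            if 0.0 < r <= cutoff:
                total += lennard_jones(r)
    return total

# Hypothetical coordinates: one ligand atom near two binding-site atoms.
ligand = [(0.0, 0.0, 0.0)]
site = [(3.8, 0.0, 0.0), (0.0, 20.0, 0.0)]  # second atom lies beyond the cutoff
print(interaction_energy(ligand, site))
```

A more negative total suggests a more favorable fit under this simplified model, while a large positive total indicates steric overlap between the compound and the site.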
The subject invention also provides the above-described methods, wherein the X-ray diffraction data are set forth in Table A.
The subject invention also provides the above-described methods, wherein the atomic coordinates are set forth in Table B.
The subject invention also provides a pharmaceutical composition comprising the compound identified by the above-described methods and a pharmaceutically acceptable carrier.
For the purposes of this invention "pharmaceutically acceptable carriers" means any of the standard pharmaceutical carriers. Examples of suitable carriers are well known in the art and may include, but are not limited to, any of the standard pharmaceutical carriers such as phosphate buffered saline solutions, phosphate buffered saline containing Polysorb 80, water, emulsions such as oil/water emulsion, and various types of wetting agents. Other carriers may also include sterile solutions, tablets, coated tablets, and capsules.
Typically such carriers contain excipients such as starch, milk, sugar, certain types of clay, gelatin, stearic acid or salts thereof, magnesium or calcium stearate, talc, vegetable fats or oils, gums, glycols, or other known excipients. Such carriers may also include flavor and color additives or other ingredients. Compositions comprising such carriers are formulated by well known conventional methods.
The subject invention also provides the above-described methods, wherein the compound is not previously known.
The subject invention also provides the compounds identified by the above-described methods.
The subject invention also provides the compound designed by the above-described methods.
The subject invention also provides a composition comprising the above-described compounds and a suitable carrier.
This invention provides a method of inhibiting the interaction of HIV-gp120 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4, which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
(a) a benzyl group that binds to the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 at a distance of 3.4 ang. or otherwise disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
(b) a phenyl group that binds to the side chain carboxylate group of gp120 aspartic acid 368 at a distance of 3.1 ang. or otherwise disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
(c) a phenyl group that binds to the alpha methylene group of gp120 tryptophan 427 at a distance of 3.5 ang. or otherwise disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
(d) a phenyl group that binds to the alpha methylene group of gp120 glycine 473 at a distance of 3.8 ang. or otherwise disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
(e) a phenyl group that binds to the alpha carbonyl group of gp120 glycine 473 at a distance of 3.7 ang. or otherwise disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
(f) a phenyl group that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 at a distance of 3.4-3.5 ang. or otherwise disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
(g) a phenyl group that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 at a distance of 3.1 ang. or otherwise disrupts the dipolar interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
(h) a propylguanidinium group that binds to the side chain carboxyl group of gp120 aspartic acid 368 at a distance of 1.7 ang. or otherwise disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59;
(i) a propylguanidinium group that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 at a distance of 3.3 ang. or otherwise disrupts the ionic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59; and/or
(j) an amide group that binds to the side chain propylalcohol group of gp120 threonine 123 at a distance of 4.6 ang. or otherwise disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59.
As used herein said distance is the distance between nearest interacting heavy atoms in said groups of gp120 and CD4 in the crystal structure. Said distances to comparable groups in other gp120 isolates (shown in parentheses) have not been measured. Side chains do not include alpha carbons or alpha substituents.
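The distance convention just described, the separation between the nearest heavy (non-hydrogen) atoms of two groups, can be sketched as follows. The function name, the atom tuple layout, and the coordinates are hypothetical and chosen only for illustration; they are not taken from the deposited coordinates.

```python
import math

def nearest_heavy_atom_distance(group_a, group_b):
    """Return the smallest distance (angstroms) between any non-hydrogen atom
    in group_a and any non-hydrogen atom in group_b.

    Each atom is a tuple (element, x, y, z) with coordinates in angstroms.
    """
    heavy_a = [atom[1:] for atom in group_a if atom[0].upper() != "H"]
    heavy_b = [atom[1:] for atom in group_b if atom[0].upper() != "H"]
    if not heavy_a or not heavy_b:
        raise ValueError("each group needs at least one heavy atom")
    return min(math.dist(p, q) for p in heavy_a for q in heavy_b)

# Made-up coordinates for illustration only; hydrogens are ignored by the function.
phenyl_fragment = [("C", 0.0, 0.0, 0.0), ("C", 1.4, 0.0, 0.0), ("H", -1.0, 0.0, 0.0)]
carboxylate_fragment = [("O", 4.5, 0.0, 0.0), ("O", 5.0, 2.0, 0.0), ("H", 2.0, 0.0, 0.0)]
print(round(nearest_heavy_atom_distance(phenyl_fragment, carboxylate_fragment), 2))
```

With these placeholder coordinates the nearest heavy-atom pair is the second carbon and the first oxygen, 3.1 angstroms apart; the closer hydrogen does not count.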
A method of inhibiting the interaction of HIV-gp120 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4, which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. a benzyl group that binds to the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 at a distance of 3.4 ang. or otherwise disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
b. a benzyl group that binds to the side chain carboxylate group of gp120 aspartic acid 368 at a distance of 3.1 ang. or otherwise disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
c. a phenyl group that binds to the alpha methylene group of gp120 tryptophan 427 at a distance of 3.5 ang. or otherwise disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
d. a phenyl group that binds to the alpha methylene group of gp120 glycine 473 at a distance of 3.8 ang. or otherwise disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
e. a phenyl group that binds to the alpha carbonyl group of gp120 glycine 473 at a distance of 3.7 ang. or otherwise disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
f. a phenyl group that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 at a distance of 3.4-3.5 ang. or otherwise disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
g. a phenyl group that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 at a distance of 3.1 ang. or otherwise disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
h. a propylguanidinium group that binds to the side chain carboxyl group of gp120 aspartic acid 368 at a distance of 1.7 ang. or otherwise disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59;
i. a propylguanidinium group that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 at a distance of 3.3 ang. or otherwise disrupts the ionic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. an amide group that binds to the side chain propylalcohol group of gp120 threonine 123 at a distance of 4.6 ang. or otherwise disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59;
k. a propionamide group that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain propionamide group of CD4 glutamine 40;
l. a propionamide group that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
m. an amide group that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
n. a methyl alcohol group that binds to the alpha amino group of gp120 lysine 429 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
o. a methyl alcohol group that binds to the alpha carbonyl group of gp120 lysine 429 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
p. a methyl alcohol group that binds to the alpha carbonyl group of gp120 tryptophan 427 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
q. a methyl alcohol group that binds to the alpha amino group of gp120 valine (or alanine) 430 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the sidechain hydroxyl group of CD4 serine 42;
r. a methyl alcohol group that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 at a distance of 3.7 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the sidechain hydroxyl group of CD4 serine 42;
s. an amido group that binds to the alpha carbonyl group of gp120 glycine 473 at a distance of 3.9 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
t. an amido group that binds to the alpha carbonyl group of gp120 tryptophan 427 at a distance of 4.5 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
u. an amido group that binds to the sidechain carboxyl group of gp120 aspartic acid 368 at a distance of 3.2 ang. or otherwise disrupts the hydrogen bond between the sidechain carboxyl group of gp120 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
v. an amido group that binds to the alpha amino group of gp120 aspartic acid 368 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
w. a hydroxypropyl group that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 at a distance of 3.9 ang. or otherwise disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the sidechain hydroxypropyl group of CD4 threonine 45;
x. an amido group that binds to the alpha amino group of gp120 glycine 366 at a distance of 3.5 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
y. an amido group that binds to the alpha carbonyl group of gp120 glycine 366 at a distance of 3.3 ang. or otherwise disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
z. an amido group that binds to the alpha amino group of gp120 glycine 367 at a distance of 2.9 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
a1. a butylammonium group that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang. or otherwise disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
b1. an amide group that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang. or otherwise disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
c1. an amido group that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
d1. an amido group that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 at a distance of 2.6 ang. or otherwise disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
e1. a propionamido group that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 at a distance of 4.2 ang. or otherwise disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the sidechain amide group of CD4 glutamine 33;
f1. a propionamido group that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.9 ang. or otherwise disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the sidechain propionamido group of CD4 glutamine 33; and/or
g1. an acetamido group that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 at a distance of 3.4 ang. or otherwise disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the sidechain amide of CD4 asparagine 52.
This invention also provides a method of inhibiting the interaction of HIV-gp120 with CD4 which comprises administering to a mammal in need thereof a compound, with the proviso that the compound is not CD4, capable of disrupting two or more of the contacts between gp120 and CD4 as set forth in Table C.
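The "two or more contacts" condition can be read as a simple counting test. The following is a minimal Python sketch offered for illustration only and is not part of the disclosed method: the `Contact` record, the `disrupts_two_or_more` helper, and the three sample entries (taken from interactions listed in this section, not from Table C itself) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """One gp120-CD4 contact (hypothetical record layout)."""
    gp120_residue: str
    cd4_residue: str
    kind: str


# Sample contacts drawn from the interactions listed in this section;
# a real table would hold every contact of Table C.
CONTACTS = frozenset({
    Contact("Asp368", "Arg59", "ionic"),
    Contact("Gly366", "Lys46", "hydrogen bond"),
    Contact("Gly367", "Lys46", "hydrogen bond"),
})


def disrupts_two_or_more(disrupted, table=CONTACTS):
    """Return True if at least two of the listed contacts are disrupted."""
    return len(set(disrupted) & set(table)) >= 2
```

Contacts that do not appear in the table are ignored by the set intersection, so only listed contacts count toward the threshold.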
This invention also provides a method for identifying a compound capable of binding to the CD4 binding site of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining the CD4 binding site on the gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising a polypeptide having amino acid sequence of a portion of gp120 capable of binding to CD4; and (b) determining whether a compound would fit into the binding site, a positive fitting indicating that the compound is capable of binding to the CD4 binding site of the gp120.
This invention also provides a method for designing a compound capable of binding to the CD4 binding site of Human Immunodeficiency Virus envelope glycoprotein gp120 comprising: (a) determining the CD4 binding site on the gp120 based on the atomic coordinates computed from X-ray diffraction data of a crystal comprising a polypeptide having amino acid sequence of a portion of gp120 capable of binding to CD4; and (b) designing a compound to fit the CD4 binding site.
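Step (a) of these methods, locating the binding site from atomic coordinates, can be illustrated with a distance-cutoff search. The sketch below is an assumption-laden example in Python and not the procedure used by the inventors: the atom layout `(residue_label, (x, y, z))`, the function name `binding_site_residues`, and the 4.0 ang. default cutoff are all hypothetical.

```python
import math


def binding_site_residues(gp120_atoms, cd4_atoms, cutoff=4.0):
    """Return gp120 residue labels having any atom within `cutoff` ang. of a CD4 atom.

    Each atom is a (residue_label, (x, y, z)) tuple. This layout and the
    default cutoff are illustrative assumptions, not values from the text.
    """
    site = set()
    for residue, xyz in gp120_atoms:
        # A residue belongs to the site if any of its atoms is close to CD4.
        if any(math.dist(xyz, cd4_xyz) <= cutoff for _, cd4_xyz in cd4_atoms):
            site.add(residue)
    return sorted(site)
```

In practice the coordinates would come from a structure such as the one tabulated in Table B; the toy tuples used to exercise this function are not real data.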
This invention also provides the above-described methods, wherein the crystal further comprises CD4, a second polypeptide having amino acid sequence of a portion of CD4, or a compound known to be able to bind to the CD4 site of the gp120, bound to the polypeptide.
This invention also provides the above-described methods, wherein the fitting is determined by shape complementarity or by estimated interaction energy.
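One common way to obtain an "estimated interaction energy" is to sum a pairwise potential over ligand and binding-site atoms. The Python sketch below uses a 12-6 Lennard-Jones term purely as an example; the text does not name a specific energy function, and the function names and the `epsilon` and `sigma` values are placeholders chosen for illustration.

```python
import math


def lj_pair(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy for one atom pair at separation r (ang.)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def estimated_interaction_energy(ligand_xyz, site_xyz, epsilon=0.2, sigma=3.5):
    """Sum pairwise energies between ligand atoms and binding-site atoms.

    Both arguments are sequences of (x, y, z) tuples; a more negative
    total indicates a more favorable fit under this simplified model.
    """
    return sum(
        lj_pair(math.dist(a, b), epsilon, sigma)
        for a in ligand_xyz
        for b in site_xyz
    )
```

A single uniform parameter pair is a deliberate simplification; a practical scoring function would use atom-type-specific parameters and additional terms such as electrostatics.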
This invention also provides the above-described methods, wherein the X-ray diffraction data are set forth in Table A.
This invention also provides the above-described methods, wherein the atomic coordinates are set forth in Table B.
This invention also provides a pharmaceutical composition comprising the compound identified by the above-described methods and a pharmaceutically acceptable carrier. This invention also provides the above-described methods, wherein the compound is not previously known.
This invention also provides the compound identified by the above-described methods.
This invention also provides the compound designed by the above-described methods.
This invention also provides a composition comprising the above-described compounds and a suitable carrier.
This invention also provides a method of inhibiting Human Immunodeficiency Virus infection in a subject comprising administering an effective amount of the above-described composition to the subject.
In embodiments of the above-described methods, the above-described compounds are nonpeptidyl.
This invention provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isobutyl (isopropyl or methyl) group of gp120 isoleucine (valine or alanine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (isopropyl or methyl) group of gp120 isoleucine (valine or alanine) 371 and CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein Het is phenyl, Bn, EtPh, or heteroarylalkyl;
c. a group X that binds to the side chain carboxylate group of gpl20 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gpl20 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein X is hydroxyalkyl, hydroxyaryl, alkylamide, or arylamide;
d. an aromatic group or heteroaromatic group, Het, that binds to the side chain indole group of gpl20 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gpl20 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gpl20 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gpl20 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an alkyl group, R, that binds to the beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 or disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43, wherein R is alkyl, cycloalkyl, or haloalkyl;
h. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
i. a group X that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of gp120 asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
j. a group Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
k. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
l. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59, wherein Z is alkoxyalkyl, aryloxyalkyl, alkoxyaryl, haloalkyl, haloaryl, alkylamide, arylamide, alkylcarboxylate, arylcarboxylate, arylalkyl ester, dialkyl ester, or alkylaryl ester.
This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c. an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gpl20 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gpl20 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gpl20 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 and the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gpl20 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gpl20 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 and the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gpl20 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gpl20 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gpl20 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gpl20 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gpl20 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gpl20 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gpl20 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gpl20 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32, wherein Q is dialkylketone, alkylarylketone, or arylalkylketone;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 and the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 and the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 and the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 and the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropyl alcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 and the side chain ethyl alcohol group of CD4 serine 60.
This invention also provides a method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c. an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gpl20 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gpl20 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gpl20 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, or an aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the hydrogen bond interaction between the side chain propylalcohol group of threonine 123 and the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 and the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 and the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 and the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gpl20 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gpl20 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gpl20 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gpl20 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gpl20 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gpl20 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gpl20 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gpl20 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 and the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 and the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 and the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 and the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropyl alcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 and the side chain ethyl alcohol group of CD4 serine 50;
j1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan (or phenylalanine) 112 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
k1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain phenyl group of gp120 phenylalanine 382 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
l1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 384 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
m1. an alkyl group, R, that binds to the side chain alkyl group of gp120 valine (isoleucine, or glutamine) 255 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
n1. a group, X, that binds to the side chain hydroxyl group of gp120 threonine 257 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
o1. a group, Y, that binds to the side chain carboxyl group of gp120 glutamic acid 370 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
p1. an alkyl group, R, that binds to the side chain isobutyl group of gp120 isoleucine 424 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
q1. an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan 427 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43; and/or
r1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 435 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43.
This invention also provides compounds which mimic the CD4 loop described as follows:

 42    43    44    45    46    47    48
Ser - Phe - Leu - Thr - Lys - Gly - Pro
 |
Gly - Gln - Asn - Gly - Leu - Ile - Lys
 41    40    39    38    37    36    35

CD4 Lys-35 to Pro-48
The sidechains of Lys35, Gln40, Phe43, and Thr45 protrude out of the protein (out of the page); the other sidechains protrude into the exterior of the protein (behind the page).
This invention comprises compounds of formula (I) or formula (II):
Figure imgf000083_0001
Formula (II)
wherein:
X is a group that is designed to mimic Arg- 59 in CD4 ;
Y is a group that is designed to mimic Phe-43 in CD4;
Z is -NRC(R1)(R2)CO-;
Figure imgf000084_0001
; or -NHCH(COQ)(CH2)nNH-; where
Q is -OH, -OR, -NR1R2 or -NH-Lys-OR; and
W is either or W is NR1R2
Figure imgf000084_0002
W is NR; m=0-14, n=0-6; and
R, R1, and R2 are the same or different and are hydrogen, alkyl, Bn, EtPh, (cycloalkyl)alkyl, arylalkyl, heteroarylalkyl, haloalkyl, hydroxyalkyl, aminoalkyl, or an amino acid sidechain.
Preferably, X is NH2(CH2)n-, NH=C(NH2)NH(CH2)n-, or Het-(CH2)n-.
Preferably, Y is Aryl(CH2)n- or Cyclohexyl(CH2)n-.
Unless otherwise indicated, the terms are defined as follows:
The term "alkyl" is used herein at all occurrences to mean a straight or branched chain radical of 1 to 6 carbon atoms, unless the chain length is limited thereto, including, but not limited to, methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl, and the like.
The terms "halo" or "halogen" are used interchangeably herein at all occurrences to mean radicals derived from the elements chlorine, fluorine, iodine and bromine.
The terms "aryl" or "heteroaryl" are used herein at all occurrences to mean substituted and unsubstituted aromatic ring(s) or ring systems which may include bi- or tri-cyclic systems and heteroaryl moieties, which may include, but are not limited to, heteroatoms selected from O, N, or S. Representative examples include, but are not limited to, phenyl, benzyl, naphthyl, pyridyl, quinolinyl, thiazinyl, and furanyl.
The term "Bn" is used herein at all occurrences to mean benzyl.
The term "Ph" is used herein at all occurrences to mean phenyl.
Specific preferred compounds of this invention are as follows:
Figure imgf000085_0001
wherein Q = OH, m = 0; Z = Leu-ThrNHBn;
(B)
Figure imgf000086_0001
(C)
Ser - (Bn)Lys - Leu - Thr - Lys - Gly - ε-NH
 |                                        |
Gly - Gln - Asn - Gly - Leu - Ile - Lys
(D)
Ser - (Bn)Lys - Leu - Thr - Lys - ε-NH
 |                                  |
Gly - Gln - Asn - Gly - Leu - D-Lys - Lys
(E)
Figure imgf000086_0002
(F)
Ser - (Bn)Lys - Leu
Btd
Gly - Gln z
where Btd is
Figure imgf000087_0001
Compounds of formula (I) and formula (II) can be made according to the following examples. It will be recognized by the skilled artisan that reagents and starting materials are commercially available or can be made by standard methods of peptide synthesis. It will also be recognized that when synthesizing the compounds of this invention, suitably protected amino acid intermediates which are the building blocks of compounds of formula (I) and formula (II) are generated. These amino acid intermediates are assembled to give compounds of formula (I) and formula (II) by known methods.
Synthesis of (Bn)Lys as shown in preferred compounds of formula (I) and formula (II) (B) through (F) above is described in Gander-Coquoz, M., Seebach, D., Helv. Chim. Acta, 1988, 71, 224, incorporated herein by reference.
Synthesis of Boc-NH-(-Bn)Lys as shown in preferred compound of formula (I), Compound (A) above, is described in Viret, J., Gabard, J., and Collet, A., Tetrahedron, 1987, 43, 891-894, incorporated herein by reference, and is shown in the following Scheme:
Figure imgf000088_0001
a) KOCN, 60°, 4 h; KOCl, KOH; Boc2O
When Z is the following moiety
Figure imgf000088_0002
(see preferred compound (B) above), the synthesis is described in Freidinger et al., J. Org. Chem., 1982, 47, 104-109, incorporated herein by reference.
Synthesis of preferred Compound (A) : The synthesis of preferred Compound (A) is based upon the methods of preparation of compounds described in the following, which are all incorporated herein by reference:
a. Michael Kahn, Preparation of CD4 β-Turn Mimetics, WO 93/24518 A1, published December 9, 1993.
b. Michael Kahn, Preparation of Conformationally Restricted Mimetics of Reverse Turns and Peptides Containing the Same, WO 94/03494 A1, published February 17, 1994.
c. Ramurthy, S., Lee, M.S., Nakanishi, H., Shen, R., and Kahn, M., Peptidomimetic Antagonists Designed to Inhibit the Binding of CD4 to HIV gp120, Bioorg. Med. Chem., 1994, 2, 1007-1013.
d. Chen, S., Chrusciel, R.A., Nakanishi, H., Raktabutr, A., Johnson, M.E., Sato, A., Weiner, D., Hoxie, J., Saragovi, H.U., Greene, M.I., and Kahn, M., Design and Synthesis of a CD4 β-Turn that Inhibits Human Immunodeficiency Virus Envelope Glycoprotein gp120 Binding and Infection of Human Lymphocytes, Proc. Natl. Acad. Sci. USA, 1992, 89, 5872-5876.
In the preferred compound (F) above, moiety Btd is synthesized by the procedure described in Ngai et al., Tetrahedron, 1993, 49, 3577-3592 (incorporated herein by reference).
The above description fully discloses the invention including preferred embodiments thereof. Modifications and improvements of the embodiments specifically disclosed herein are within the scope of the following claims. Without further elaboration it is believed that one skilled in the art can, given the preceding description, utilize the present invention to its fullest extent. Therefore any examples are to be construed as merely illustrative and not a limitation on the scope of the present invention in any way. The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows.
This invention provides a vaccine comprising a polypeptide having 6 or more amino acids in the same spatial proximity to each other as the amino acids from the Phe 43 cavity of naturally occurring gp120.
This invention also provides the above-described vaccine, wherein the 6 or more amino acids are identical to the amino acids of naturally occurring gp120.
This invention further provides the above-described vaccines, wherein the amino acids are within 1 angstrom of their distances in naturally occurring gp120.
This invention also provides the above-described vaccines, wherein the amino acids are within 3 angstroms of their distances in naturally occurring gp120.
This invention provides the above-described vaccines, wherein the amino acids are within 5 angstroms of their distances in naturally occurring gp120.
This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
This invention further provides the above-described vaccines, further comprising a carrier.
This invention also provides the above-described vaccines, further comprising an adjuvant.
This invention provides a vaccine comprising a polypeptide having 6 or more continuous amino acids from the Phe 43 cavity of gp120.
This invention provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
This invention also provides the above-described vaccines, further comprising a carrier.
This invention further provides the above-described vaccines, further comprising an adjuvant.
This invention further provides a vaccine comprising a polypeptide having 6 or more amino acids in the same spatial proximity to each other as the surface accessible amino acids adjacent to the Phe 43 cavity of naturally occurring gp120.
This invention also provides the above-described vaccines, wherein the 6 or more amino acids are identical to the amino acids of naturally occurring gp120.
This invention provides the above-described vaccines, wherein the amino acids are within 1 angstrom of their distances in naturally occurring gp120.
This invention also provides the above-described vaccines, wherein the amino acids are within 3 angstroms of their distances in naturally occurring gp120.
This invention further provides the above-described vaccines, wherein the amino acids are within 5 angstroms of their distances in naturally occurring gp120.
This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
This invention further provides the above-described vaccines, further comprising a carrier.
This invention also provides the above-described vaccines, further comprising an adjuvant.
This invention also provides the above-described vaccines, wherein the surface accessible amino acids comprise Lysine 432, Proline 369, and Threonine 373.
This invention further provides a vaccine comprising a polypeptide having 6 or more continuous surface accessible amino acids adjacent to the Phe 43 cavity of gp120.
This invention also provides the above-described vaccines, wherein the polypeptide is or is part of a conserved neutralization epitope.
This invention further provides the above-described vaccines, further comprising a carrier.
This invention also provides the above-described vaccines, further comprising an adjuvant.
Table Summarizing the CCR5-binding residues of gp120
SET A   117, 121, or 123
SET B   207
SET C   330
SET D   419, 420, 421, 422, 437, 438, 440, 441, 442, or 444
This invention further provides a method of inhibiting cell entry by HIV, comprising blocking or inhibiting the residues from 2 or more of the sets of the CCR5-binding residues set forth above, thereby inhibiting or preventing gp120 from binding to CCR5 and thereby inhibiting cell entry by HIV.
This invention also provides the above-described method, wherein 3 or more of the sets of the CCR5-binding residues set forth above are blocked or inhibited from interacting with CCR5.
This invention also provides the above described methods, wherein the blocking or inhibiting comprises contacting the CCR5-binding residues with an antibody.
Experimental Details
First Series of Experiments
Theoretical Analysis.
Much of the crystallization literature is anecdotal, reflective perhaps of the diverse nature of proteins. If a particular protein fails to crystallize, one may be faced with a peculiar quandary: on one hand, a bewildering array of options which have been reported to work on at least one other, often quite different, protein; and on the other hand, no clear way of distinguishing which strategy is optimal.
Many variables affect crystallization, and it is impossible to optimize each of them. Rather, the question is how to stack the odds in one's favor. An appropriate crystallization strategy may be merely a matter of finding the right focus: if a protein is only 30% pure, and consequently has virtually no chance of crystallizing, purification would be the key; if a protein is 98% pure, further purification would probably be ineffective. One important consideration, then, in evaluating different crystallization strategies is, given the particulars of the situation, the degree to which each strategy enhances the overall probability of crystallization. This probability enhancement is straightforward to calculate in some cases.
For example, with a strategy of screening ever larger arrays of crystallization conditions, if 80% of all proteins crystallize from a core set of 100 conditions (23), such a strategy could at most enhance the overall probability of crystallization by only 20%; after the first 100 or so conditions, further screening produces increasingly diminishing returns. By contrast, for the variational crystallization strategy employed here, the quantitative enhancement of crystallization probability is not immediately apparent.
Crystalline order is explicitly dependent on lattice homogeneity. Reducing heterogeneity can be thought of as increasing the proportion of surface area available for formation of lattice contacts, thereby increasing the probability of crystallization. The probability that a single lattice contact between two molecules is homogeneous is in part related to the fraction of surface that is homogeneous on one molecule multiplied by the fraction homogeneous on the other, that is, to:
$$(\%\ \text{homogeneous surface, molecule 1}) \times (\%\ \text{homogeneous surface, molecule 2})$$
This is equal to $(\%\ \text{homogeneous surface, molecule 1})^{2}$ if molecule 1 and molecule 2 are the same.
Generalizing from two molecules, the overall crystallization probability is related to $\sum (H - \delta)^{2C}$, where the sum is over all possible lattices; "H" is the fraction of the surface which may form lattice contacts; "δ" is a function of the size of the lattice contact and the degree of surface homogeneity, related to the occlusion of available surface area upon formation of each lattice contact as well as the spatial distribution of homogeneous surface over the molecular surface; and "C" is the number of unique contacts required to make a set of symmetry-related molecules into a crystal lattice (different for each space group). The observed average value of C (Cave) is ~4.5 (24), with a minimum theoretical value for the most common space groups of 2 or 3 (24).
C may be relatively small since lattice contacts often make up only a small proportion of a macromolecule surface, and considerable surface heterogeneity may be tolerated. Given a reduction in surface heterogeneity, what is the change in crystallization probability? Surface area is correlated with molecular weight (MW) by the power law: surface area = $6.3\,\mathrm{MW}^{0.73}$, which on average predicts surface area to within 4% for monomeric protein (25). The fraction of homogeneous surface can thus be approximated as a ratio of molecular weights of the total and the homogeneous portion of the protein:
$$H \approx \left[\frac{\mathrm{MW(homogeneous)}}{\mathrm{MW(total)}}\right]^{0.73}$$
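The power law and the resulting approximation for H can be sketched numerically. This is only an illustration: the function names are hypothetical, the 50 kDa and 60 kDa masses are made-up inputs rather than values from this work, and the text does not state the units of the power law.

```python
def surface_area(mw: float) -> float:
    """Approximate surface area from molecular weight via 6.3 * MW**0.73.

    Units are not stated in the text, so none are assumed here.
    """
    return 6.3 * mw ** 0.73


def homogeneous_fraction(mw_homogeneous: float, mw_total: float) -> float:
    """Approximate H as the ratio of the two power-law surface areas.

    The prefactor 6.3 cancels, leaving (MW_homogeneous / MW_total) ** 0.73.
    """
    return surface_area(mw_homogeneous) / surface_area(mw_total)


# Hypothetical example: a 50 kDa ordered portion within a 60 kDa protein.
h = homogeneous_fraction(50_000.0, 60_000.0)
print(f"H is approximately {h:.3f}")
```

Because only the ratio of molecular weights enters H, the result does not depend on the prefactor or on the units chosen for the power law.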
For the probability ratio of before (Pi) and after (Pf) reducing heterogeneity (approximating δ ≈ 0),
$$\frac{P_f}{P_i} \approx \frac{\sum \left[\mathrm{MW(homogeneous)}_f / \mathrm{MW(total)}_f\right]^{1.46\,C}}{\sum \left[\mathrm{MW(homogeneous)}_i / \mathrm{MW(total)}_i\right]^{1.46\,C}}$$
This equation is still not very useful since MW(homogeneous) is unknown and molecule-specific. In reducing heterogeneity, however, it seems reasonable to assume that the removed portion, if it is a highly branched carbohydrate or a proteolytically exposed region, is completely heterogeneous. In such cases,
$\mathrm{MW(homogeneous)}_f \approx \mathrm{MW(homogeneous)}_i$. Then, assuming that the summation is similar for both cases, the above equation reduces to:
$$\frac{P_f}{P_i} \approx \left[\frac{\mathrm{MW(total)}_i}{\mathrm{MW(total)}_f}\right]^{1.46\,C_{ave}} \quad \text{(for removed portion heterogeneous)}$$
This last equation allows the change in crystallization probability upon heterogeneity removal to be quantitated. Say, for example, one recombinantly produces a protein of 110 amino acids, which includes 10 flexible amino acids from the cloning vector. Is it worth removing the 10 amino acids? The above equation shows that removing the 10 amino acids would enhance the crystallization probability by:
$$\frac{P_f}{P_i} - 1 \approx \left[\frac{110}{110-10}\right]^{1.46 \times 4.5} - 1 = 0.87, \text{ or about 90\%}$$
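The arithmetic behind this figure can be written out step by step. The following is simply a worked evaluation of the expression above, with the number of amino acids standing in for molecular weight as in the example:

```latex
\begin{align*}
1.46 \times C_{ave} &= 1.46 \times 4.5 = 6.57 \\
\left(\frac{110}{100}\right)^{6.57} &= e^{6.57 \ln 1.1} \approx e^{0.626} \approx 1.87 \\
\frac{P_f}{P_i} - 1 &\approx 1.87 - 1 = 0.87
\end{align*}
```

A roughly 9% reduction in chain length therefore corresponds, under these assumptions, to a nearly two-fold change in the estimated crystallization probability, because the mass ratio is raised to a power of about 6.6.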
Another aspect of variational crystallization, the use of multiple variants of the same protein, also increases the probability of crystal formation. In this case, the overall probability of crystallization is exponentially related to the number of variants. Assuming independence of variants (a reasonable assumption with different protein ligands; not as valid with minor changes), with n variants and a probability of crystallization for each variant of P, the overall probability Pτ is:
$$P_\tau = 1 - (1 - P)^{n}$$
For example, if each variant of a relatively heterogeneous protein has only a 25% chance of crystallizing, the overall probability is $1 - (1 - 0.25)^{n}$; with 15 variants, the probability increases to almost 99%.
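The multiple-variant formula is easy to evaluate directly. The sketch below is illustrative only; the function name is hypothetical, and it assumes, as the text does, that the variants crystallize independently with the same per-variant probability.

```python
def overall_probability(p: float, n: int) -> float:
    """Probability that at least one of n independent variants crystallizes,
    given a per-variant crystallization probability p."""
    return 1.0 - (1.0 - p) ** n


# Values from the text: 25% per variant, 15 variants.
p_total = overall_probability(0.25, 15)
print(f"{p_total:.4f}")  # about 0.987, i.e. "almost 99%"

# How the overall probability grows with the number of variants.
for n in (1, 5, 10, 15):
    print(n, round(overall_probability(0.25, n), 3))
```

The loop shows the diminishing but steady gain from each additional variant under the independence assumption.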
In general, the enhancement in overall probability is given by the ratio (Pτ / P) - 1. If one tries many variants, and $(1-P)^{n}$ is much smaller than 1, then
$$\frac{P_\tau}{P} = \frac{1}{P} - \frac{(1-P)^{n}}{P} \approx \frac{1}{P} \quad \text{(for n large)}$$
and the enhancement is related to the initial probability of crystallizing a single variant. Thus the more difficult a protein is to crystallize, the more it benefits from this multiple variant strategy.
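A short worked limit makes this dependence on the initial probability concrete. The 25% value is the one used in the text; the 5% value is a hypothetical example of a more difficult protein.

```latex
\begin{align*}
\frac{P_\tau}{P} - 1 &= \frac{1 - (1-P)^{n}}{P} - 1
  \;\longrightarrow\; \frac{1}{P} - 1 \quad (n \to \infty) \\
P = 0.25: \quad \frac{1}{0.25} - 1 &= 3 \quad (300\%) \\
P = 0.05: \quad \frac{1}{0.05} - 1 &= 19 \quad (1900\%)
\end{align*}
```

In this limit the maximum attainable enhancement grows as the single-variant probability shrinks, which is the sense in which a hard-to-crystallize protein gains the most from trying many variants.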
Gp120 constructs. The various recombinant gp120 glycoproteins used for crystallization trials were produced in stable Drosophila Schneider 2 producer lines under the control of an inducible promoter as previously described (20) (Table 1).
TABLE 1 THE GP120 CONSTRUCTS
Construct | gp120 Strain | Amino Acids in Construct* | Reference
Δ61-IIIB | IIIB | 62-511 | (48)
Δ30-FL | JRFL | 31-511 | (49)
ΔV1/2ΔV3 | HxBc2 | 31-120 GAG 204-297 GAG 330-511 | (41)
ΔV1/2ΔV3ΔC5 | HxBc2 | 31-120 GAG 204-297 GAG 330-492 | (50)
Δ82ΔV1/2*ΔV3ΔC5 | HxBc2 | 83-127 GAG 195-297 GAG 330-492 | (50)
Δ82ΔV1/2*ΔV3*ΔC5 | HxBc2 | 83-127 GAG 195-302 GAG 325-492 | (50)
* Sequence numbers refer to the translated gp160, with the mature gp120 beginning at +31. N-terminal sequencing showed that all constructs contained 4 additional amino acids, Gly-Ala-Arg-Ser, an artifact of the signal peptide cleavage. GAG here refers to the tripeptide, Gly-Ala-Gly, which was substituted for the removed amino acids.
Protein production and purification. The N-terminal two domains of CD4 (D1D2), residues 1-182, were produced in Chinese hamster ovary (CHO) cells and purified as described previously (21). The human monoclonal antibodies, 17b, A32, C11 and F105 (derived from HIV-1 infected individuals), and mouse monoclonal antibodies, L71 and 178.1, were purified by Protein-A affinity chromatography. Secreted gp120 from Drosophila cells was purified by F105-Protein A affinity chromatography which used a glycine pH 2.8 elution step followed by immediate Tris base neutralization.
Protease Digestion. Fabs were produced by papain digestion of monoclonal antibodies. Briefly, the antibody was reduced in 100 mM DTT, 100 mM NaCl, 50 mM Tris pH 8.0 for 1 hr at 37° C, and dialyzed (4° C), first in phosphate-buffered saline (PBS) to reduce the DTT concentration to about 1 mM, then in alkylating solution (PBS titrated to pH 7.5 with 2 mM iodoacetamide, 48 hr), and subsequently in PBS without iodoacetamide. The reduced and alkylated antibody was concentrated to at least 2 mg/ml and digested with papain using the commercial protocol (Pierce). An additional gel filtration chromatographic step on a Superdex S-200 column (Pharmacia, FPLC) was added to ensure oligomeric homogeneity.
The gp120 proteins were subjected to digestion with the proteases papain, elastase, and subtilisin (Boehringer Mannheim) to assay for proteolytic susceptibility. In these assays, the gp120 concentration was kept constant and the protease diluted serially (3.3x) from a ratio of 1:10 to 1:1000. The digestion mix was incubated for 1 hr at 37° C and quenched by addition of 1% SDS (1:10 ratio) with immediate heating in boiling water for 2 minutes. Digestion products were analyzed with SDS-polyacrylamide gel electrophoresis (PAGE) with and without DTT reduction.
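The serial protease dilution described above can be laid out as a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, the series is assumed to be purely geometric with a 3.3-fold step, and the text does not say how many tubes were used or whether the last point was exactly 1:1000.

```python
def protease_dilution_series(start: float = 10.0,
                             factor: float = 3.3,
                             limit: float = 1000.0) -> list[float]:
    """Return the x values of protease:gp120 ratios 1:x, starting at 1:start
    and diluting by `factor` until the ratio reaches or passes 1:limit."""
    ratios = [start]
    while ratios[-1] < limit:
        ratios.append(ratios[-1] * factor)
    return ratios


series = protease_dilution_series()
for x in series:
    print(f"1:{x:.0f}")
```

With a 3.3-fold step, four dilutions from 1:10 land slightly beyond 1:1000, which suggests (but does not prove) that the stated range was covered in about five points of roughly half-log spacing.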
Carboxypeptidase Y digestion was used to analyze the C-terminus of gp120. A 1:10 ratio of carboxypeptidase Y (Boehringer Mannheim) to gp120 was incubated for 1 hr at 37° C, pH 7.0. Even though digestion could not be easily seen by SDS-PAGE, the C-terminus of gp120, HXBc2 strain, contains a number of positively charged amino acids, and the extent of the reaction could be monitored by native-PAGE.
Deglycosylation. Drosophila-produced gp120 proteins were deglycosylated enzymatically. Briefly, 0.5 mg/ml of gp120 was incubated with various deglycosylating enzymes (singly or in combination) in 0.5 M NaCl, 100 mM Na Acetate, pH 5.7, for 10 hr at 37° C. Endoglycosidase D was used at a concentration of 0.1 U/ml, Endoglycosidase F at 0.25 U/ml, Endoglycosidase H at 0.25 U/ml, and Glycopeptidase F at 0.1 U/ml (all from Boehringer Mannheim). For crystallization variants involving the CD4-gp120 complex, the addition of D1D2 (which lacks carbohydrate) to the deglycosylation cocktail was found to enhance gp120 solubility. The deglycosylation reactions were monitored by following the reduction in molecular weight on SDS-polyacrylamide gel electrophoresis (SDS-PAGE). Deglycosylation was nearly complete within 30 min of incubation and the reactions appeared to plateau after 3 hr. The extent of deglycosylation was judged by matrix-assisted laser desorption ionization (MALDI) mass spectroscopy, carbohydrate analysis, affinity for concanavalin-A, and mobility and band width on SDS-PAGE. Protein aggregation was assayed by native-PAGE, dynamic light scattering, and gel filtration chromatography.
Monoclonal antibody binding assay. The various gp120 glycoproteins were assessed for recognition by a variety of monoclonal antibodies directed against both linear and discontinuous gp120 epitopes by either immunoprecipitation (31) or by ELISA (32). The ELISA was performed with both fully glycosylated and deglycosylated ΔV1/2ΔV3 glycoproteins immobilized on ELISA plates using a capture antibody specific for the gp120 carboxyl-terminus, 6205 (International Enzymes) (32).
Binary and ternary complex purification. To ensure proper stoichiometry and oligomeric homogeneity, all complexes were purified by gel filtration chromatography using a Superdex S-200 column (Pharmacia, FPLC). This column exhibited good resolution with routine separation of samples that differed by only 30% in molecular weight. Individual components were first purified separately to ascertain their monomeric status. Complexes were then combined and repurified using the same column. A buffer of 0.35 M NaCl, 5 mM Tris/Cl pH 7.0, 0.02% NaN3 was used throughout. Peak fractions were concentrated over Centricon-30 (Amicon) to a final protein concentration of ~10 mg/ml and either aliquoted and stored at -80°C or used directly for crystallization.
Crystallization. The vapor diffusion hanging droplet technique was used for all crystallizations. Small volumes, 0.5 μl protein solution + 0.5 μl reservoir solution, were used for virtually all crystallizations, screenings as well as final optimizations.
Screening. The Crystal Screen I (Hampton Research) was used, augmented by roughly 20 conditions which tested high protein concentrations (vapor diffusion concentration of the protein at various pHs) as well as mixtures of organic additives (2-5% MPD, PEG 400, or PEG 4000) combined with high ionic strength (2-4 M NaCl, Am2SO4 or Na/KPO4) at pH 5.5-9.5. For each gp120 crystallization variant, a subset of 12 different conditions was analyzed in depth to establish the approximate precipitation point of the protein under a variety of different precipitants. The factorial solutions were then individually adjusted to target the observed precipitation point and a full screen of ~70 conditions was set up at 20 °C. After at least one week of constant daily observation, screening solutions were recalibrated to account for the observed 20° C precipitation point and another full screen at 4° C was set up. If no crystals were observed, the Crystal Screen II (Hampton Research) was set up at 20° C.
Optimization. In addition to the standard single variable optimization of crystallization conditions, a factorial-like procedure was used to determine if small amounts of different additives increased crystal quality. Type E crystals were grown from the following conditions: Protein: Δ82ΔV1/2*ΔV3ΔC5 gp120, two-domain CD4 (D1D2), and Fab 17b, purified as a ternary complex on the Superdex S-200; Droplet: 0.5 μl protein solution (~10 mg/ml protein in gel filtration buffer) + 0.4 μl droplet mix containing 0.1 M NaCitrate, 0.02 M NaHepes, 10% isopropanol, 8% PEG 5000 (Fluka), 0.0075% SeaPrep Agarose (FMC BioProducts), pH 6.4; Reservoir: 0.35 M NaCl, 0.1 M NaCitrate, 0.02 M Hepes, 10% isopropanol, 8% PEG 5000, pH 6.4. The droplet mix was kept at 37°C to ensure agarose solubility, and the crystallization was set up at room temperature. Clumps of crystals appeared within two weeks of incubation at 20 °C and grew for several months to maximal size.
X-ray diffraction characterization. All data were collected at Beamline X4A, Brookhaven National Laboratory. The type E crystals were crosslinked with the vapor diffusion technique of Lusty (33) by placing a crystallization bridge (Hampton Research) with a 25 μl sitting droplet of 1% glutaraldehyde (Sigma) in the reservoir of a standard hanging droplet vapor diffusion crystallization setup for 1 hr at room temperature. The crosslinked crystal was washed with stabilizer (reservoir solution with only 50 mM NaCl) containing 10% ethylene glycol. After approximately 24 hr, the external liquid surrounding the crystal was replaced with Paratone-N (Exxon), the crystal mounted in an ethylene loop (Hampton Research) (34), and flash-cooled in the nitrogen stream of a cryostat (details are provided in (35)). Oscillation data were processed with DENZO (36) and scaled with SCALEPACK (36).
Experimental Results and Discussion
Lattice contacts are made solely at the molecular surface. Unlike small molecules, macromolecules have interiors; considerable surface variability, and hence crystallization variability, is tolerated while maintaining the same basic fold or even enzymatic abilities. A prescient example that pre-dates the powerful methods of modern molecular biology was John Kendrew's screening of myoglobins from many different organisms until he found one, from sperm whale, that crystallized well (37). Indeed, human myoglobin requires a Lys to Arg mutation in order to produce crystals suitable for structural analysis (38). Conversely, one of the most well-ordered protein crystals, crambin, is actually a mixture of two isoforms with sequence variation at internal residues (39).
To address the many problems associated with the crystallization of HIV-1 gp120, we exploited the mutability of the macromolecular surface using strategies that involved protein modification and conformational restriction (see Table 2). Several of these strategies contain novel features and are detailed here as well.
TABLE 2 CRYSTALLIZATION PROBLEMS, PROTEIN MODIFICATION SOLUTIONS AND ENHANCEMENT OF CRYSTALLIZATION PROBABILITY
Problem | Solution | Probability Enhancement (1)
N-linked carbohydrate | Protein production in an inducible Drosophila cell line coupled with deglycosylation with Endoglycosidases D and H | 1200%
Surface loop flexibility | Replacement of V1/V2 and V3 loops with the tripeptide linker, Gly-Ala-Gly | 370%
Conformational heterogeneity | Conformation restriction with protein ligands such as CD4 and Fabs from conformationally sensitive monoclonal antibodies |
N- and C-terminal heterogeneity | Mutational deletion and proteolytic cleavage analysis coupled to the production of gp120 with truncated N- and C-termini | 50%
The probability enhancement, [(Pf/Pi) - 1], was calculated from the equation ([MW(total)i / MW(total)f]^(1.46*Cave) - 1) with C = 4.5, the average observed contact number (see text). For the Drosophila-produced HXBc2, the molecular weight for the glycosylated gp120 is approximately 90 kDa; the deglycosylated gp120, 60 kDa; and the deglycosylated ΔV1/2ΔV3 gp120, 47 kDa. The N-terminus is resistant to proteolysis from +39 to +82, and thus probably adopts an ordered conformation. This number was calculated assuming only the C-terminal 19 and the N-terminal 8 amino acids were disordered.
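The enhancement formula in this footnote can be evaluated with the approximate molecular weights it quotes. The sketch below is illustrative: the function and constant names are hypothetical, and because the quoted masses are rounded ("approximately 90 kDa", etc.), the results should be expected to match the tabulated enhancements only in order of magnitude, not exactly.

```python
C_AVE = 4.5      # average observed contact number quoted in the text
EXPONENT = 1.46  # 2 * 0.73, from the surface-area power law


def probability_enhancement(mw_initial: float, mw_final: float,
                            c: float = C_AVE) -> float:
    """Return (Pf/Pi) - 1 for removal of a fully heterogeneous portion,
    using [MW(total)i / MW(total)f] ** (1.46 * C) - 1."""
    return (mw_initial / mw_final) ** (EXPONENT * c) - 1.0


# Rounded molecular weights (kDa) from the footnote.
deglycosylation = probability_enhancement(90.0, 60.0)
loop_replacement = probability_enhancement(60.0, 47.0)

print(f"deglycosylation:  {deglycosylation:.0%}")
print(f"loop replacement: {loop_replacement:.0%}")
```

Both steps come out as several-fold to more than ten-fold enhancements, in line with the scale of the table entries; the exact percentages depend on the unrounded masses, which are not given here.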
Variants of gp120 were developed through an iterative cycle which strove to eliminate heterogeneity. The cycle involved recombinant production of gp120 variants, deglycosylation, and then assessment of heterogeneity and flexibility by examination of glycosylation status, monoclonal antibody binding, and protease sensitivity, leading to the design of new constructs. For example, at the gp120 C-terminus, protease digestion and native PAGE detected variability, and carboxypeptidase Y digestion generated a 15-20 amino acid deletion which retained CD4 binding activity. A homogeneous product was difficult to make by this method, and primer-based PCR mutagenesis and recombinant expression were used to generate a homogeneous gp120 with a 19 amino acid C-terminal deletion. At the N-terminus, sequencing of the initial constructs showed the expected signal cleavage at +31, with four additional amino acids, Gly-Ala-Arg-Ser, added from the signal peptide. Protease digestion gave a product at +39, indicating flexibility in this region. Progressive genetic truncation and biochemical analysis identified +83 as a variant that was recognized by conformation-dependent gp120 ligands, whereas +93 exhibited some conformational disruption (31). Thus much of the apparently flexible region at the N-terminus of gp120 could be removed without disrupting the global conformation of the protein.
To further reduce flexibility, variable loops V1, V2, and V3 were replaced. Little effect was found on CD4 binding (32,40,41). Three constructs were made which contained deletions of the V1, V2, and V3 loops (Table 1). With ΔV1/2ΔV3, the entire base and stem of the variable loops V1, V2 and V3 were excised. With ΔV1/2*ΔV3, the conserved stem of the V1/V2 stem-loop structure was retained, restoring the CD4-induced antibody epitopes in the presence of soluble CD4. With ΔV1/2*ΔV3*, the base of the V3 loop was retained as well, fully restoring CD4-induced antibody epitopes, even in the absence of soluble CD4.
For the asparagine-linked carbohydrate, Dionex chromatography of fully proteolyzed gp120 showed that the carbohydrate on the Drosophila-expressed protein consisted of (N-acetyl-glucosamine)2(fucose)F(mannose)M, with F = 0-1 and M = 3-9. Deglycosylation with enzymes such as Glycopeptidase F (or Endoglycosidase F at pH 5.0), which cleave the glycosidic linkage and convert the N-linked asparagine into an aspartic acid, resulted in gp120 aggregation, although it remained soluble. Cleavage of the 1-4 β-bonds in the chitobiose core with Endoglycosidases D or H, leaving only a single N-acetyl-glucosamine and potentially a 1-6 fucose attached to any of the glycosylated asparagine residues, appeared to leave the protein intact as judged by a panel of conformationally sensitive monoclonal antibodies (32). Digestion of full-length constructs with Endoglycosidase H, which has specificity for oligosaccharides with 5-9 mannoses, removed roughly 60% of the carbohydrate, and addition of Endoglycosidase D, which cleaves oligosaccharides with 3 or 4 mannoses, removed up to 90% of the carbohydrate.
For the loop-deleted constructs, all mannose residues were removed with the Endoglycosidase D/H combination as judged by the inability of concanavalin A to bind to the deglycosylated protein. Mass spectroscopy of the deglycosylated Δ82ΔV1/2*ΔV3ΔC5 gp120 showed a molecular weight of 39,000 +/- 50 D, consistent with a mass of 35.4 kD for the protein and 3.6 kD for the remaining carbohydrate. Carbohydrate analysis showed only fucose and N-acetyl-glucosamine to be present, in a ratio of 1:3.05 ± 0.02, respectively. For the 18 potential asparagine glycosylation sites in the Δ82ΔV1/2*ΔV3ΔC5 gp120, these results are consistent with 5 unused, 9 with N-acetyl-glucosamine and 4 with N-acetyl-glucosamine (1-6) fucose.
Protein ligands, CD4 and the Fabs of monoclonal antibodies, were used in an attempt to reduce overall surface, and hence potential crystal lattice, mobility. This was complicated by the internal mobility of these ligands: CD4 has a flexible juncture between the second and third extracellular domains (42), and Fabs have a conformationally mobile "elbow bend" between their variable and constant domains (43). For CD4, we used a construct containing only the N-terminal two domains (1-182), for which there was previous success in structure determination (14). For the monoclonal antibodies, we limited the crystallization screens to using only one Fab at a time, even though combinations with multiple Fabs were possible.
Initial trials with Fab 178.1, which recognizes a linear epitope in V3 of both free and CD4-bound gp120 (29), gave only crystalline precipitate at best. On the CD4 side, we tested the Fab of L71, which recognizes the CDR3-like domain D1 (30), but had difficulties preparing ternary complexes, probably due to a destabilization of the CD4-gp120 interaction. Subsequently, we focused on antibodies with discontinuous epitopes, which were more likely to recognize conformationally rigid portions of gp120. Complexes of gp120 proteins with Fabs of C11, which recognizes an epitope spanning C1 and C5 (27), and F105, whose epitope lies within C2, C3, C4, and C5 (overlapping the CD4 binding site) (28), gave only poor crystals (Table 3). We had greater success with the monoclonal antibody 17b, which not only recognizes a discontinuous epitope but discriminates between different conformational states of gp120 (44). The Fab of 17b did not bind the initial ΔV1/2ΔV3 gp120 construct, and required the restoration of the stem of the V1/V2 loop (constructs ΔV1/2*ΔV3 or ΔV1/2*ΔV3*).
We screened 18 different combinations of gp120 variants and ligands (Table 3) using a limited factorial-based crystallization screen. Factorial screening was originally devised as a method for deducing the essential crystallization factors from combinations of different conditions (45). The empirical observation, however, that most crystallizable macromolecules are able to crystallize from a limited set of common conditions has validated an entirely different process: crystallization screening with a small but diverse collection of fixed conditions (23). A high probability of success has been reported with as few as 6 different conditions at 4 different concentrations (39), and commercial kits are available with 50-100 conditions (Hampton Research).
TABLE 3 SUMMARY OF HIV-1 GP120 CRYSTALLIZATION ATTEMPTS
HIV-1 gp120 Construct | Glycosylation Status (1) | Cofactors (2) | Comments
Δ61-IIIB (IIIB strain) | glycosylated | | precipitate looks bad
 | ~60% deglycosylated | | precipitate looks bad
 | ~90% deglycosylated but some Asn to Asp | | precipitate looks better but not good
 | ~60% deglycosylated | D1D2 sCD4 | precipitate looks ok
 | ~60% deglycosylated | Fab 178.1 | precipitate looks ok
 | ~60% deglycosylated | D1D2 sCD4 and Fab 178.1 | precipitate looks ok
 | ~90% deglycosylated | | precipitate looks ok
Δ30-FL (JRFL strain) | ~90% deglycosylated | | precipitate looks ok
 | | D1D2 sCD4 | good looking precipitate
 | | Fab 178.1 | crystalline precipitate
 | | D1D2 sCD4 and Fab 178.1 | good looking precipitate, no crystals
ΔV1/2ΔV3 (HXBc2 strain) | fully deglycosylated | | good looking precipitate, no crystals
 | | D1D2 sCD4 | very small nice looking crystals in PEG 400 (Crystal Type A)
 | | D1D2 sCD4 and Fab C11 | badly formed crystals from (NH4)2SO4 (Crystal Type B)
ΔV1/2ΔV3ΔC5 (HXBc2 strain) | fully deglycosylated | D1D2 sCD4 | spheroidal crystals in PEG 4000 (Crystal Type C)
 | | Fab F105 | nice precipitate, no crystals
Figure imgf000110_0001
to the percent of N-linked sites cleaved by endoglycosidase D or H. Thus the "fully deglycosylated" protein still contains N-acetyl-glucosamine and fucose moieties.
D1D2 sCD4 refers to two-domain soluble CD4. Antibody epitopes are described in the text.
In conjunction with the limited crystallization screen, small volume droplets were used, typically 0.5 μl of protein per crystallization trial. With small volumes, only 1-2 mg of protein were sufficient to evaluate each gpl20 crystallization variant. Smaller volumes were found to be more efficient at nucleation than larger droplets, perhaps due to higher surface tension effects resulting in greater variations in precipitant concentration, thereby permitting each droplet to sample a wider range of precipitant concentrations. Indeed, droplets that were "spread-out" also showed enhanced nucleation. This explanation may also account for the well-known observation that crystals frequently nucleate from the edges of crystallizaton droplets.
The initial crystallization screens produced six different types of crystals (Table 4). For crystal types A-D, extensive optimization was unable to produce single crystals large enough to be characterized. For crystal types E and F, single crystals of needle morphology could be grown. With type E crystals, the needle axis was coincident with the a axis, and the cross-section perpendicular to the needle axis was a rhombohedron bounded by faces of the form (0 1 1) and (0 -1 1). These could be distinguished from type F crystals, where the cross-section was hexagonal. Single crystals of types E and F were analyzed for diffraction in capillary mounts. Only type E crystals showed diffraction. Gel electrophoresis of these crystals demonstrated that they contained gp120, D1D2 and Fab 17b (Figure 4).
TABLE 4 CRYSTALLIZATION CONDITIONS FOR INITIAL GP120 CRYSTALS
Figure imgf000112_0001
* All binary and ternary complexes were purified by gel filtration. D1D2 sCD4 refers to the two-domain soluble CD4. ** The protein concentration is given as the absorbance (280 nm) of the complex per ml of solution. *** Most of the reservoirs are conditions from Crystal Screen 1 (Hampton Research); the reagent numbers given here refer to the crystallization reagents from this commercial kit. Hanging droplets were 0.5 μl protein (in 0.35 M NaCl, 5 mM Tris pH 7.0, 0.02% NaN3) + 0.5 μl reservoir, except for crystal type B, which used 0.5 μl of 3-fold diluted reservoir. Crystallization reservoirs were 500 μl; an additional 35 μl of 5 M NaCl was added after the droplet was mixed to compensate for the NaCl in the protein solution. All dilutions used H2O, except for crystal type F, where 22.5% isopropanol was used. Crystallizations were set up at room temperature and incubated at 20 °C.
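The NaCl compensation step described in the Table 4 notes can be checked with a short dilution calculation. The following is a minimal sketch, assuming that volumes are additive and that the reservoir reagent itself contributes no NaCl (an assumption; individual kit reagents may contain salt). The function name `final_molarity` is hypothetical and used only for illustration.

```python
def final_molarity(stock_molar: float, stock_ul: float, base_ul: float) -> float:
    """Molarity after adding stock_ul of stock solution to base_ul of salt-free liquid.

    Assumes additive volumes and no solute in the base liquid (both assumptions).
    """
    return stock_molar * stock_ul / (stock_ul + base_ul)


# Values taken from the text: 35 ul of 5 M NaCl added to a 500 ul reservoir.
reservoir_nacl = final_molarity(stock_molar=5.0, stock_ul=35.0, base_ul=500.0)

# NaCl concentration of the protein buffer, as stated in the text.
protein_nacl = 0.35

print(f"Reservoir NaCl after addition: {reservoir_nacl:.3f} M")
print(f"Protein buffer NaCl:           {protein_nacl:.2f} M")
print(f"Difference:                    {protein_nacl - reservoir_nacl:.3f} M")
```

Under these assumptions the reservoir ends up at roughly 0.33 M NaCl, close to the 0.35 M NaCl of the protein buffer, which is consistent with the stated purpose of the addition.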
Growth of single crystals of type E required the addition of small amounts of agarose. SeaPrep, with a gelling point near room temperature, gave the best results. Extensive crystallization optimization failed to produce large single crystals. Despite considerable effort, the best typical crystals were rods with a cross-section of only 30 x 40 μm. A closely related crystallization variant, which retained 10 additional amino acids in the stem of the V3 loop, failed to crystallize.
We were unable to flash-cool the type E crystals with standard cryoprotectants. Satisfactory results were obtained with a procedure that (i) fortified the crystals with vapor-diffusion glutaraldehyde crosslinking, (ii) permeated the crystals with 10% ethylene glycol and (iii) used an immiscible oil, Paratone-N, to replace the external solution around the crystals prior to flash-cooling (33, 35). Cryopreserved crystals diffracted to Bragg spacings of better than 2 Å, although the diffraction was slightly anisotropic, with higher mosaicity along the 88 Å b-axis. Data to 2.2 Å have been collected (Table 5).
TABLE 5 DATA COLLECTION STATISTICS FOR CRYSTALS OF THE TWO-DOMAIN CD4 (D1D2) / FAB 17B / Δ82ΔV1/2*ΔV3ΔC5 GP120 COMPLEX
Data | d range (Å) | # reflections (unique) | Rsym (%) | Completeness (%)
all data | 20-2.2 | 56,195 | 14.5 | 87.4
last shell | 2.48-2.2 | 13,928 | 35.5 | 73.1
Our success with gp120 demonstrates the power of variational crystallization. We have derived equations that quantitate the effect of this strategy on the overall probability of crystallization and have calculated these for several of the biochemical and molecular biological manipulations employed in this study. As can be seen (Table 2), the probability of crystallization may be strongly influenced by reducing molecular surface heterogeneity. The influence of using multiple variants is more difficult to quantitate since it is dependent on the probability of crystallization for each variant. Nonetheless, our theoretical analysis shows that the effect of multiple variants is greatest for proteins least likely to crystallize.
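The equations referred to above are not reproduced in this section. As a hedged illustration of why multiple variants help most for the least crystallizable proteins, the sketch below uses a simple textbook model, not necessarily the derivation used here: each of n variants is assumed to crystallize independently with the same probability p in the limited screen, so the probability that at least one crystallizes is 1 - (1 - p)^n. The function name `p_any`, the values of p, and the choice of n = 20 are all hypothetical.

```python
def p_any(p: float, n: int) -> float:
    """Probability that at least one of n independent variants crystallizes,
    assuming each variant has the same per-variant probability p (an assumption)."""
    return 1.0 - (1.0 - p) ** n


N_VARIANTS = 20  # hypothetical number of variants screened

# Illustrative per-variant probabilities, from hard to easy targets (invented values).
for p in (0.01, 0.1, 0.5):
    overall = p_any(p, N_VARIANTS)
    fold_gain = overall / p  # improvement relative to screening a single variant
    print(f"p = {p:4.2f}: overall = {overall:.3f}, fold gain = {fold_gain:.1f}")
```

In this simplified model the fold gain from screening many variants approaches n when p is small and shrinks toward 1/p when p is large, which matches the qualitative statement that the benefit is greatest for proteins least likely to crystallize.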
The crystallization literature is replete with examples of protein manipulation, from proteolytic digestion, to variation in solvating detergent, to screening of DNA oligonucleotides (47). What distinguishes our efforts is the derivation of a theoretical foundation that rationalizes our approach: a comprehensive focus on surface modification to eliminate heterogeneity and to present new crystallization variants, coupled to a limited screen of crystallization conditions. The types of crystallization problems embodied in gp120 (Table 2) are not so different from many of the typical problems facing present-day crystallographers; from both a theoretical and a practical perspective, the strategy of variational crystallization used here should be broadly applicable.
Table 6, which follows, shows the critical residues in gp120 for interactions with CD4.
Table 6. Critical Residues in gp120 for Interactions with CD4
Figure imgf000115_0003
Variability is given in parentheses (0, 1: little variability), followed by other residues known in variants.
Figure imgf000115_0002
Figure imgf000115_0001
References for the First Series of Experiments
1. Kelders, H. A., Kalk, K. H., Gros, P., and Hol, W. G. J. (1987) Protein Eng. 1, 301-303.
2. Morris, D. W., Kim, C. Y., and McPherson, A. (1989) Biotechniques 1, 522-527.
3. Gallo, R. C., Salahuddin, S. Z., Popovic, M., Shearer, G. M., Kaplan, M., Haynes, B. F., Palker, T. J., Redfield, R., Oleske, J., Safai, B., White, G., Foster, P., and Markham, P. D. (1984) Science 224, 500-503.
4. Barre-Sinoussi, F., Chermann, J. C., Rey, F., Nugeyre, M. T., Charmaret, S., Gruest, J., Dauguet, C., Axler-Blin, C., Vezinet-Brun, F., Rouzioux, C., Rosenbaum, W., and Montagnier, L. (1983) Science 220, 868-871.
5. Dalgleish, A. G., Beverly, P. C. L., Clapham, P. R., Crawford, D. H., Greaves, M. F., and Weiss, R. A. (1984) Nature 312, 763-767.
6. Klatzmann, D., Champagne, E., Charmaret, S., Gruest, J., Guetard, D., Hercend, T., Gluckman, J. C., and Montagnier, L. (1984) Nature 312, 767-768.
7. Zhang, L., Huang, Y., He, T., Cao, Y., and Ho, D. D. (1996) Nature 383, 768.
8. Feng, F., Broder, C. C., Kennedy, P. E., and Berger, E. A. (1996) Science 272, 872-877.
9. Dragic, T., Litwin, V., Allaway, G. P., Martin, S. R., Huang, Y., Nagashima, K. A., Cayanan, C., Maddon, P. J., Koup, R. A., Moore, J. P., and Paxton, W. A. (1996) Nature 381, 667-673.
10. Deng, H., Liu, R., Ellmeier, W., Choe, S., Unutmaz, D., Burkhart, M., Di Marzio, P., Marmon, S., Sutton, R. E., Hill, C. M., Davis, C. B., Peiper, S. C., Schall, T. J., Littman, D. R., and Landau, N. R. (1996) Nature 381, 661-666.
11. Choe, H., Farzan, M., Sun, Y., Sullivan, N., Rollins, B., Ponath, P. D., Wu, L., Mackay, C. R., LaRosa, G., Newman, W., Gerard, N., Gerard, C., and Sodroski, J. (1996) Cell 85, 1135-1148.
12. Alkhatib, G., Combadiere, C., Broder, C. C., Feng, Y., Kennedy, P. E., Murphy, P. M., and Berger, E. A. (1996) Science 272, 1955-1958.
13. Wang, J. H., Yan, Y. W., Garrett, T. P., Liu, J. H., Rodgers, D. W., Garlick, R. L., Tarr, G. E., Husain, Y., Reinhertz, E. L., and Harrison, S. C. (1990) Nature 348, 411-418.
14. Ryu, S.-E., Kwong, P. D., Truneh, A., Porter, T. G., Arthos, J., Rosenberg, M., Dai, X., Xuong, N., Axel, R., Sweet, R. W., and Hendrickson, W. A. (1990) Nature 348, 419-426.
15. Zhang, X., Gaubin, M., Briant, L., Srikantan, V., Murali, R., Saragovi, U., Weiner, D., Devaux, C., Autiero, M., Piatier-Tonneau, D., and Greene, M. I. (1997) Nat. Biotechnol. 15, 150-154.
16. Jarvest, R., A. L., B., Edge, C. M., Chaikin, M. A., Jennings, L. J., Truneh, A., Sweet, R. W., and Hertzberg, R. P. (1993) Bioorg. Med. Chem. 3, 2851-2856.
17. Chen, S., Chrusciel, R. S., Nakanishi, H., Rakabutr, A., Johnson, M. E., Sato, A., Weiner, D., Hoxie, J., Saragovi, H. U., Greene, M. I., et al. (1992) Proc. Natl. Acad. Sci. USA 89, 5872-5876.
18. Myers, G., Wain-Hobson, S., Henderson, L., Korber, B., Jeang, K.-T., and Pavlakis, G. (1994), Los Alamos National Laboratory, Los Alamos, New Mexico.
19. Leonard, C. K., Spellman, M. W., Riddle, L., Harris, R. J., Thomas, J. N., and Gregory, T. J. (1990) J. Biol. Chem. 265, 10373-10382.
20. Starcich, B. R., Hahn, B. H., Shaw, G. M., McNeely, P. D., Modrow, S., Wolf, H., Parks, W. P., Josephs, S. F., Gallo, R. C., and Wong-Staal, F. (1986) Cell 45, 637-648.
21. Sattentau, Q. J., Moore, J. P., Vignaux, F., Traincard, F., and Poignard, P. (1983) J. Virol. 64, 7383-7393.
22. Thali, M., Moore, J. P., Furman, C., Charles, M., Ho, D. D., Robinson, J., and Sodroski, J. (1993) J. Virol. 67, 3978-3988.
23. Jancarik, J., and Kim, S. H. (1991) J. Appl. Cryst. 24, 409-411.
24. Wukovitz, S. W., and Yeates, T. O. (1995) Nature Struct. Biol. 2, 1062-1067.
25. Miller, S., Janin, J., Lesk, A. M., and Chothia, C. (1987) J. Mol. Biol. 196, 641-656.
26. Cherbas, L., Moss, R., and Cherbas, P. (1994) Methods Cell Biol. 44, 161-179.
27. Moore, J. P., and Sodroski, J. (1996) J. Virol. 70, 1863-1872.
28. Thali, M., Olshevsky, U., Furman, C., Gabuzda, D., Posner, M., and Sodroski, J. (1991) J. Virol. 65, 6188-6193.
29. Langedijk, J. P., Back, N. K., Kinney-Thomas, E., Bruck, C., Francotte, M., Goudsmit, J., and Meloen, R. H. (1992) Arch. Virol. 126, 129-146.
30. Truneh, A., Buck, D., Cassatt, D. R., Jusczak, R., Kassis, S., Ryu, S.-E., Healey, D., Sweet, R., and Sattentau, Q. J. (1991) J. Biol. Chem. 266, 5942-5948.
31. Wyatt, R., Desjardin, E., Olshevsky, U., Nixon, C., Binley, J. M., Olshevsky, V., and Sodroski, J. (1997) J. Virol., in the press.
32. Binley, J. M., Wyatt, R. A., Desjardins, E., Kwong, P. D., Hendrickson, W. A., Moore, J. P., and Sodroski, J. (1997) AIDS Research Hum. Retroviruses, in the press.
33. Lusty, C. J. (1997) submitted.
34. Teng, T. Y. (1990) J. Appl. Crystallogr. 23, 387-391.
35. Kwong, P. D., and Liu, Y. (1997) submitted.
36. Otwinowski, Z., and Minor, W. (1997) Methods Enzymol. 276, 307-326.
37. Kendrew, J. C., and Parrish, R. G. (1956) Proc. Roy. Soc. A 238, 305-324.
38. Hubbard, S. R., Hendrickson, W. A., Lambright, D. G., and Boxer, S. G. (1990) J. Mol. Biol. 213, 215-218.
39. Teeter, M. M., and Hendrickson, W. A. (1979) J. Mol. Biol. 127, 219-223.
40. Pollard, S. R., Rosa, M. D., Rosa, J. J., and Wiley, D. C. (1992) EMBO J. 11, 585-591.
41. Wyatt, R., Sullivan, N., Thali, M., Repke, H., Ho, D., Robinson, J., Posner, M., and Sodroski, J. (1993) J. Virol. 67, 4557-4565.
42. Wu, H., Kwong, P. D., and Hendrickson, W. A. (1997) Nature 387, 527-530.
43. Lesk, A. M., and Chothia, C. (1988) Nature 335, 188-190.
44. Thali, M., Moore, J. P., Furman, C., Charles, M., Ho, D. D., Robinson, J., and Sodroski, J. (1993) J. Virol. 67, 3978-3988.
45. Carter, C. W. J., and Carter, C. W. (1979) J. Biol. Chem. 254, 12219-12223.
46. Stura, E. A., Nemerow, G. R., and Wilson, I. A. (1991) in Freiburg Macromolecular Crystallization Meeting, pp. 1-12, Journal of Crystal Growth.
47. Ducruix, A., and Giege, R. (1992) Crystallization of nucleic acids and proteins. The Practical approach series, Oxford Univ. Press.
48. Culp, J. S., Johansen, H., Hellmig, B., Beck, J., Matthews, T. J., Delers, A., and Rosenberg, M. (1991) Biotechnology 9, 173-177.
49. Ivey-Hoyle, M., Culp, J. S., Chaikin, M. A., Hellmig, B. D., Matthews, T. J., and Sweet, R. W. (1991) Proc. Natl. Acad. Sci. USA 88, 512-516.
50. Wu, L., Gerard, N. P., Wyatt, R., Choe, H., Parolin, C., Ruffing, N., Borsetti, A., Cardoso, A. A., Desjardin, E., Newman, W., Gerard, C., and Sodroski, J. (1996) Nature 384, 179-183.
Second Series of Experiments
The human immunodeficiency viruses (HIV-1 and HIV-2) and simian immunodeficiency viruses (SIV) are the etiologic agents of acquired immunodeficiency syndrome (AIDS) in their respective human and simian hosts (1). Typically, infection with primate immunodeficiency viruses is characterized by an initial phase of high-level viremia, followed by a long period of persistent virus replication at a lower level (2). Viral persistence occurs despite specific antiviral immune responses, which include the generation of neutralizing antibodies.
The primate immunodeficiency viruses, like all retroviruses, are surrounded by an envelope consisting of a host cell-derived lipid bilayer and virus-encoded envelope glycoproteins (3). To enter host cells, the viral membrane must be fused with the plasma membrane of the cell, a process mediated by the envelope glycoproteins. The exposed location of these proteins on the virus allows them to carry out their function but also renders them uniquely accessible to neutralizing antibodies. Thus, dual selective forces, virus replication and immune pressure, have shaped the evolution of the envelope glycoproteins and continue to do so within each infected host. The current understanding of the functional features of these proteins is summarized below.
Synthesis and assembly of the envelope glycoproteins.
In the infected cell, the envelope glycoproteins are synthesized as a precursor of approximately 845-870 amino acids in the rough endoplasmic reticulum. N-linked, high-mannose sugar chains are added to form the gp160 glycoprotein, which assembles into oligomers (4-6). The preponderance of evidence suggests that these oligomeric complexes are trimers (4,5). The gp160 trimers are transported to the Golgi apparatus, where cleavage by a cellular protease generates the mature envelope glycoproteins: gp120, the exterior envelope glycoprotein, and gp41, the transmembrane glycoprotein (3). The gp41 glycoprotein possesses an ectodomain that is largely responsible for trimerization (7), a membrane-spanning anchor, and a long cytoplasmic tail. Most of the surface-exposed elements of the mature, oligomeric envelope glycoprotein complex are contained on the gp120 glycoprotein. Selected, presumably well-exposed, carbohydrates on the gp120 glycoprotein are modified in the Golgi apparatus by the addition of complex sugars (6). The gp120 and gp41 glycoproteins are maintained in the assembled trimer by non-covalent, somewhat labile interactions between the gp41 ectodomain and discontinuous structures composed of N- and C-terminal gp120 sequences (8). Upon reaching the infected cell surface, a fraction of these envelope glycoprotein complexes is incorporated into budding virus particles. A large number of the complexes disassemble, releasing gp120 and exposing the previously buried gp41 ectodomain. These events contribute to the formation of defective virions, which predominate in any retroviral preparation (9).
Binding of the envelope glycoproteins to the CD4 receptor.
Many cell surface proteins, including adhesion molecules, are incorporated into HIV-1 virions along with the envelope glycoprotein complexes (10). These host cell-derived molecules can assist the attachment of viruses to potential target cells. Virus attachment also involves the interaction of the gp120 envelope glycoproteins with specific receptors, the CD4 glycoprotein (11) and members of the chemokine receptor family (12, 13) (Fig. 6). The CD4 glycoprotein is expressed on the surface of T lymphocytes, monocytes, dendritic cells, and brain microglia, the main target cells for primate immunodeficiency viruses in vivo. The requirement for CD4 binding exhibited by most primate immunodeficiency viruses for efficient entry is consistent with this observed in vivo tropism. A major function of CD4 binding is to induce conformational changes in the gp120 glycoprotein that contribute to the formation and/or exposure of the binding site for the chemokine receptor (13, 14). Some HIV-1 and HIV-2 isolates cultured in the laboratory, as well as several primary SIV isolates, no longer depend upon CD4 for efficient entry and bind to chemokine receptors but not CD4 for entry (16). These observations raise the distinct possibility that the chemokine receptors represent the primordial, obligate receptors for this retroviral lineage. The use of CD4 as a receptor may have evolved subsequently, allowing the high-affinity chemokine receptor-binding site of primate immunodeficiency viruses to be sequestered from host immune surveillance.
Multiple approaches have yielded insights into the structural basis for CD4 binding by the primate immunodeficiency virus gp120 glycoproteins. Early comparisons of gp120 sequences revealed the existence of five variable (V1-V5) regions interspersed with five conserved regions (17). Intramolecular disulfide bonds in the gp120 glycoprotein result in the incorporation of the first four variable regions into large, loop-like structures (6). Antibody binding studies and deletion mutagenesis have indicated that the major variable loops are well-exposed on the surface of the gp120 glycoprotein (18, 19). The more conserved regions fold into a gp120 core, which has recently been crystallized in a complex with fragments of CD4 and a neutralizing antibody (20). The gp120 core is composed of two domains, an inner domain and an outer domain (Fig. 7a). These names reflect the likely orientation of gp120 in the assembled envelope glycoprotein trimer: the inner domain faces the trimer axis and, presumably, gp41, while the outer domain is mostly exposed on the surface of the trimer. Elements of both domains contribute to CD4 binding. CD4 binds in a recessed pocket on gp120, making extensive contact over approximately 800 Å² of the gp120 surface. Two cavities are evident in the gp120-CD4 interface. A shallow cavity is filled with water molecules, while a deep cavity extends 10-15 Å into the interior of gp120. The opening of this deep cavity is occupied by phenylalanine 43 of CD4, which has been shown by mutagenic analysis to be critical for gp120 binding (21). Most of the gp120 residues previously identified as important for CD4 binding (22,23) surround the opening of the deep cavity and contribute to interactions with phenylalanine 43 of CD4. In addition, aspartic acid 368 of gp120 forms a salt bridge with arginine 59 of CD4, also shown by mutagenesis to be important for gp120 binding (21). Additionally, main-chain atoms on gp120 and CD4 form hydrogen bonds bridging the two proteins. The formation of the deep cavity in gp120 likely contributes to the transmission of CD4-induced conformational changes to gp120 elements involved in the interaction with chemokine receptors and/or gp41. The deep cavity may be a useful target for intervention by small molecular weight compounds.
Chemokine receptor binding
Most primary, clinical isolates of primate immunodeficiency viruses use the chemokine receptor CCR5 for entry (12). For most HIV-1 isolates that are transmitted and that predominate during the early years of infection, CCR5 is an obligate coreceptor, and rare individuals who are genetically deficient in CCR5 expression are relatively resistant to HIV-1 infection (24). HIV-1 isolates arising later in the course of infection often use other chemokine receptors, frequently CXCR4, in addition to CCR5 (12,24). Studies of chimeric envelope glycoproteins demonstrated that the third variable (V3) loop of gp120 is a major determinant of chemokine receptor choice (12,25). V3-deleted versions of gp120 do not bind CCR5, even though CD4 binding occurs at wild-type levels (14). Antibodies against the V3 loop interfere with gp120-CCR5 binding (14). These results support an involvement of the V3 loop in chemokine receptor binding. Other, conserved gp120 structures also appear to play an important role in chemokine receptor binding. The use of CCR5 by a diverse group of immunodeficiency viruses with divergent V3 sequences first suggested the involvement of more conserved gp120 elements (26). Antibodies that recognize conserved, discontinuous gp120 epitopes that are more exposed after CD4 binding are potent inhibitors of the gp120-CCR5 interaction (14). These CD4-induced (CD4i) epitopes are discussed further below. Recent mutagenic and structural analyses have revealed the existence of a highly conserved gp120 structure that is important for CCR5 binding (20,27) (Fig. 7, a and b). This structure is adjacent to the V3 loop and the CD4i epitopes, and is oriented to face the target cell upon gp120-CD4 binding.
gp41-mediated membrane fusion.
It is likely that the interaction of the gp120-CD4 complex with the appropriate chemokine receptor promotes additional conformational changes in the envelope glycoprotein complex. By analogy with the influenza hemagglutinin, it has been suggested that the HIV-1 gp41 ectodomain undergoes major conformational changes during virus entry (28). The proposed result of these changes is the insertion of the hydrophobic gp41 amino terminus (the "fusion peptide") into the membrane of the target cell. Mutagenic analysis (23,29) and the recently determined crystal structures of HIV-1 gp41 ectodomain fragments (5) are consistent with this model. The gp41 ectodomain structures reveal an extended, trimeric coiled coil that could potentially bridge the viral and target cell membranes (5). Interactions of other gp41 helical segments near the membrane-spanning region with the interhelical grooves of the internal coiled coil are important for fusion-related conformational changes in gp41. This interaction can be inhibited by helical peptides that mimic either of the involved gp41 helices (30) and is a potential target for future intervention with small molecular weight compounds.
The HIV-1 envelope glycoproteins as antigens.
The exposure of the primate immunodeficiency virus envelope glycoproteins on the surface of virions or infected cells makes them prime targets for antibodies that potentially block key functions of these proteins. However, the success of these viruses in achieving persistent infections implies that the viral envelope glycoproteins have evolved to be less-than-ideal immunogens and antigens. Structures on the viral envelope glycoproteins that are conserved among diverse viral strains are, in general, poorly exposed to the humoral immune system. The conserved gp120 surfaces involved in binding to its three minimally polymorphic ligands, gp41, CD4 and chemokine receptors, each exhibit particular problems with respect to the elicitation of sensitivity to neutralizing antibodies. The moieties involved in gp120-gp41 association are buried in the interior of the functional envelope glycoprotein spike (18, 31, 32). The CD4 binding site is recessed, flanked by variable regions exhibiting considerable glycosylation (19,20). The chemokine receptor-binding site is masked by variable loops, probably V3 and V2 (20,32,33) (Figure 7c). Even in the relatively conserved HIV-1 gp120 core that has been structurally analyzed, the outer domain exhibits a variable, heavily glycosylated surface (20). Since most carbohydrate moieties may appear as "self" to the immune system, this concentrated glycosylation may reduce the potential of a large portion of the gp120 surface to serve as an immunogenic target.
Despite the potential to exert potent antiviral effects, antibodies are not able to suppress virus replication completely in infected hosts. The efficacy of the humoral immune response in limiting virus spread in vivo is compromised by at least two factors: 1) the relative resistance of primary virus isolates to neutralization; and 2) the temporal pattern with which neutralizing antibodies are generated.
Decreased neutralization sensitivity of primary HIV-1 isolates.
HIV-1 viruses that have been passaged in immortalized cell lines are typically more sensitive to neutralization by antibodies or soluble CD4 than are primary, clinical isolates (34). Although other envelope glycoprotein regions can influence this phenotype, a major determinant is the structure of the gp120 major variable loops, V1/V2 and V3 (35). Thus, replacement of the V1/V2 and V3 variable loops of a laboratory-adapted virus with those of a neutralization-resistant primary isolate creates a virus similar to the parental primary virus (35). The basis for the decreased sensitivity of primary HIV-1 isolates to neutralization appears to involve a decreased exposure of the relevant gp120 epitopes to soluble CD4 or antibody. This decrease is most apparent in the context of the assembled oligomeric complex (36). A likely explanation for this neutralization resistance is that the major variable loops of primary viruses assume tightly interfacing, "closed" conformations that decrease the accessibility of many gp120 epitopes to antibodies.
The temporal pattern of the antibody response to HIV-1 infection.
The noncovalent nature of the association between gp120 and gp41 contributes to the lability of the functional envelope glycoprotein trimer (8,9). During natural infections, disassembled envelope glycoproteins apparently elicit most of the antibodies directed against these viral components. The interactive regions of gp120 and gp41 are particularly immunogenic (37). However, since the cognate antibodies cannot bind the assembled, functional envelope glycoprotein complex, they do not exhibit neutralizing activity. Thus, although antibodies against the envelope glycoproteins typically can be detected in the sera of HIV-1-infected individuals by two to three weeks after infection, most of these antibodies lack the ability to inhibit virus infection. By the time that neutralizing antibodies are efficiently elicited, HIV-1 is firmly established in the host.
Several weeks after virus infection, usually after the initial high level of viremia has subsided, neutralizing antibodies can be detected in the sera of infected animals or humans (38). These antibodies neutralize the infecting virus but often exhibit little or no activity against other strains of virus. A subset of these strain-restricted antibodies recognizes the HIV-1 V3 loop (38). These antibodies can block chemokine receptor binding (14). Other variable gp120 elements can contribute to the epitopes recognized by the strain-restricted neutralizing antibodies. It is known, for example, that antibodies directed against the gp120 V2 loop can also exhibit neutralizing activity (39). The V2 loop-associated neutralization epitopes are typically conformation-dependent. The ability of some V2- or V3-directed antibodies to recognize more than one HIV-1 strain (39,40) suggests that these major variable loops assume a finite number of conformations. This is consistent with the functional consequences on virus entry of some changes in these variable structures (41), and with the observation that amino acid substitutions in the variable loops are not random (42). The requirement for chemokine receptor binding probably constrains V3 loop variation. The V2 loop, although dispensable for the replication of some HIV-1 viruses in culture (33), helps protect the V3 loop and the conserved epitopes near the chemokine receptor binding site from neutralizing antibodies. Thus, the V2 and V3 loops reside proximal to the chemokine receptor binding site (Fig. 7), masking more conserved gp120 elements and presenting potentially variable epitopes to the immune system.
Later in the course of HIV-1 infection of humans, antibodies capable of neutralizing a wider range of HIV-1 isolates appear (43). A subset of the broadly reactive neutralizing antibodies, found in most HIV-1-infected individuals, interferes with the binding of gp120 and CD4 (43). Human monoclonal antibodies derived from HIV-1-infected individuals have been identified that recognize the gp120 glycoproteins from a diverse range of HIV-1 isolates, that block gp120-CD4 binding, and that neutralize virus infection (44). The discontinuous epitopes (the so-called CD4BS epitopes) recognized by many of these human monoclonal antibodies have been characterized by mutagenic analysis (45). The gp120 residues important for antibody binding are all located within the CD4-binding pocket on gp120 (Fig. 7b), and several of the most important residues are near the opening of the deep cavity (20). Therefore, some broadly neutralizing antibodies can apparently access the more recessed elements of the CD4 binding pocket. This is consistent with the observation that the gp120-CD4 interface is as large as that of a typical antibody-antigen complex (20).
A second group of neutralizing antibodies found in a smaller number of HIV-1-infected humans is directed against the CD4-induced (CD4i) epitopes (46). The CD4i epitopes are located near conserved gp120 structures important for chemokine receptor interaction (14) (Fig. 7b). CD4 binding has been shown to cause a change in the V2 loop conformation that allows better CD4i epitope exposure (33). In the absence of CD4, the antibodies recognizing the CD4i epitopes must bypass the overlapping V2 and V3 loops (33). Indeed, as is evident in the current crystal structure (20), this is accomplished by the protrusion of the CDR3 loop of the antibody heavy chain. Antibodies against CD4i epitopes need to bind viruses before CD4 binding occurs to achieve neutralization (47). The reason is that once the envelope glycoprotein complex binds cell surface CD4, there are severe steric constraints on the binding of an antibody to the gp120 surface facing the target cell (Fig. 6).
Another fairly conserved gp120 neutralization epitope is recognized by the 2G12 antibody (48). Unlike the other characterized HIV-1 neutralizing antibodies, which recognize gp120 structures near or within the receptor-binding sites, the 2G12 antibody apparently binds an epitope in the outer domain (Fig. 7b). Given the variability in this outer domain, the ability of the 2G12 antibody to neutralize a fair number of HIV-1 strains (48) seems paradoxical. The marked sensitivity of 2G12 antibody may recognize more conserved carbohydrate structures formed as a result of the heavy concentration of N-linked glycosylation in the gp120 outer domain. The apparent rarity with which 2G12-like antibodies are elicited attests to the success of the viral strategy of employing a heavily glycosylated outer domain surface in immune evasion.
The HIV-1 envelope glycoproteins as vaccine components.
That the human and simian immunodeficiency virus envelope glycoproteins are not ideal immunogens is an expected consequence of the immunological selective forces that drove the evolution of these viruses. The same features of the envelope glycoproteins that dictate poor immunogenicity in natural infections have hampered vaccine development. The lability of the envelope glycoprotein complex has frustrated attempts to present oligomers mimicking the functional spike to the immune system. As discussed above, the disintegration of envelope glycoprotein oligomers contributes to the preferential elicitation of non-neutralizing antibodies by the newly exposed gp120 N- and C-termini. Regardless of the context in which the envelope glycoproteins are presented, the gp120 variable loops elicit the majority of neutralizing antibodies, probably due to the exposed nature of these epitopes. It is still unclear whether conserved features in the V2 and V3 variable loops exist that can be exploited in vaccine design, or whether all possible functional configurations of these variable structures need to be represented in a cocktail of immunogens.
The discontinuous gp120 structures surrounding the receptor binding sites exhibit a relatively high degree of conservation (20), in keeping with the minimal polymorphism in the host cell receptors. The CD4 binding site constitutes a particularly attractive target. It appears to be accessible to antibodies, more so than the conserved elements of the chemokine receptor-binding region. A large fraction of the broadly neutralizing antibodies that eventually appear in HIV-1-infected individuals is directed against the CD4 binding site (43), indicating the ability of the human immune system to recognize this gp120 region and to generate an appropriate response. Nonetheless, these antibodies have been difficult to elicit in animals and vaccinated humans (49). The reasons for the relatively poor immunogenicity of the CD4 binding site are not yet understood, although several possibilities can be envisioned. Masking by variable loops (19,33) and glycosylation may contribute to the recessed nature of the CD4BS epitopes which, even on the crystallized gp120 core, occupy a 20 Å deep canyon (20). Within the CD4-binding pocket, not all of the gp120 surface is conserved among HIV-1 strains. Therefore, even when elicited, some CD4BS-directed antibodies may lack the breadth and affinity to be optimal neutralization agents. While many monoclonal antibodies against the CD4 binding site exhibit reasonable potency and breadth (44), whether a polyclonal response against the envelope glycoprotein can be focused to preferentially contain these types of antibodies remains to be seen.
The conserved element near the chemokine receptor-binding site will be a difficult target for vaccine-elicited antibodies. Known monoclonal antibodies to the CD4i epitopes must interact with virus prior to CD4 binding if neutralization is to be achieved (47). Yet these gp120 structures are poorly exposed in the absence of CD4, in large part due to the overlying V2 loop (33). This is consistent with the relative rarity with which these antibodies appear to be elicited in HIV-1-infected humans (46). Attempts to expose these structures better on gp120-based antigens seem warranted.
Summary. The HIV-1 envelope glycoproteins have evolved to be inefficient at eliciting effective antiviral antibody responses. The availability of structural information on the conserved HIV-1 gp120 neutralization epitopes should facilitate the modification of this important antigen and allow the rational testing of hypotheses regarding its poor immunogenic properties. These efforts should complement ongoing efforts to improve antigen presentation to the immune system and to create suitable animal models for the screening of vaccine candidates.
References for the Second Series of Experiments
1. F. Barre-Sinoussi et al . Science 220,868(1983); R.C. Gallo et al . , ibid. 224,500(1984); M.D. Daniel et al., ibid. ,228,1201 (1985) ; N.L. Letvin et al . , ibid. 230,71 (1985) .
2. R.W. Coombs et al . , N. Engl . J. Med. 321,1626(1989); S.J. Clark et al . , ibid. 324,950(1991); E. S. Daar, T. Moudgil, R. Meyer, D.D. Ho., ibid. 961(1991); A. S. Fauci, G. Pantaleo, S. Stanley, D. Weissman, Ann.
Int. Med. 124(1996); D.D. Ho, T. Moudgil, M. Alam, N. Engl. J. Med.321, 1621 (1989) .
3. J.S. Allan et al . , Science 228,1091(1985); W.G. Robey et al . , ibid. 229,1402(1985) .
4. P.L. Earl, B. Moss, R. W. Doms, J. Virol. 65,2047(1991); P.L. Earl, R.W. Doms, B. Moss, Proc. Natl. Acad. Sci. U.S.A. 87,648(1990); A. Pinter et al., J. Virol. 63,2674(1989); M. Lu, S. Blacklow, P.
Kim. Nature Struct. Biol. 2,1075(1995); CD. Weiss, J. Levy, J. M. White, J. Virol. 64,5674(1990) .
5. D.C. Chan, D. Fass, J. M. Berger, P.S. Kim, Cell 89,263(1997); W. Weissenhorn, A. Dessen, S.C.
Harrison, J.J. Skehel, D.C. Wiley, Nature 387,426 (1997) .
6. C.K. Leonard, et al . , J. Biol. Chem. 265,10373 (1990) .
7. P.L. Earl and B. Moss, AIDS Res. Hum. Retroviruses 9,589(1993); D. Sugars, C. Wild, T. Greenwell, T. Matthews, J. Virol. 70, (1996).
8. E. Helseth, U. Olshevsky, C. Furman, J. Sodroski, J. Virol. 65,2119(1991).
9. J.A. McKeating, A. McKnight, J.P. Moore, J. Virol. 65,852(1991); W.P. Tsai, S.R. Conley, H. Kung, R. Garrity, P. Nara, Virology 226,205(1996).
10. P.W. Berman and G.R. Nakamura, AIDS Res. Hum. Retroviruses 10,585(1994); R. Cantin, J.-F. Fortin, G. Lamontagne, M. Tremblay, Blood 90,1901(1997); R. Cantin, J.-F. Fortin, G. Lamontagne, M. Tremblay, J. Virol. 71,1922(1997); J.-F. Fortin, R. Cantin, M. Tremblay, ibid. 72,2105(1998); Castilleti et al., AIDS Res. Hum. Retroviruses 11,547(1995); J.-F. Fortin, R. Cantin, G. Lamontagne, M. Tremblay, J. Virol. 71,3588(1997); I. Frank et al., AIDS 10,1611(1996); M.M.L. Guo and J.E.K. Hildreth, AIDS Res. Hum. Retroviruses 11,1007(1995); L.E. Henderson et al., J. Virol. 61,629(1987); J. Hoxie et al., Hum. Immunol. 18,39(1987); G. Pantaleo et al., J. Exp. Med. 173,511(1991); C.D. Rizzuto and J.G. Sodroski, J. Virol. 71,4847(1997); J. Rossio, J. Bess, L.E. Henderson, P. Cresswell, L.A. Arthur, AIDS Res. Hum. Retroviruses 11,1433(1995); M. Saifuddin et al., J. Exp. Med. 182,501(1995).
11. D. Klatzmann et al . , Nature 312,767(1984); A. G. Dalgleish et al . , ibid. 312,763(1984); J.S. McDougal et al., Science 312,763(1984); J. S. McDougal et al., Science 231,382(1986).
12. Y. Feng, C. Broder, P. Kennedy, E. Berger, Science 272, 872(1996); H. Choe et al . , Cell 85,1135(1996); H.K. Deng et al . , Nature 381, 661(1996); T. Dragic et al., ibid, 667(1996); B.J. Doranz et al . , Cell 85,1149(1996); G. Alkhatib et al . , Science 272,1955 (1996) .
13. Q. Sattentau and J. Moore, J. Exp. Med. 174,407(1991); M. Thali et al . , J. Virol. 67,3987(1993); Q. Sattentau, J. Moore, F. Vignaux, F. Traincard, P. Poignard, ibid. 67 7383(1993) .
14. A. Trkola et al . , Nature 384,184(1996); L. Wu et al . , ibid., 179(1996); C. Lapham et al . , Science 274,602(1996); J.C Bandres et al . , J. Virol. 72,2500(1998); CM. Hill et al . , ibid. 71,6296 (1997) .
15. P. Chapman, A. McKnight, R. Weiss, J. Virol. 66 , 3531(1992); J. Reeves and T. Shulz, ibid. 71,1453(1997); M. J. Endres et al . , Cell 87,745(1996); J. Dumonceaux et al . , J. Virol. 72,512(1998); A L. Edinger et al . , Proc. Natl. Acad.
Sci. U.S.A. 94,14742(1997); K. Martin , et al . , Science 278,1470(1997) .
16. B.J. Willett, M.J. Hosie, J. C. Neil, J. D. Turner, J. A. Hoxie, Nature 385,587 (199&) ; B.J. Willett et al., J. Virol. 71,6407(1997); M.J. Hosie et al . , ibid. 72,2097 (1998) .
17. B.R. Sarcich et al . , Cell 45,637(1986) .
18. J. Moore, Q. Sattentau, R. Wyatt, J. Sodroski, J. Virol. 68, 469(1994); S. Pollard, M.D. Rosa, J. Rosa, D.C. Wiley, EMBO J. 11,585(1992) .
19. R. Wyatt et al . , J. Virol. 67,4557(1993) .
20. P. Kwong et al., submitted; P. Kwong et al., in preparation; R. Wyatt et al., in preparation.
21. A. Peterson and B. Seed, Cell 54,65(1988); J. Arthos et al., ibid. 57, 469 (1989) ;M. Brodsky, M. Warton, R. M. Myers, D.R. Litman, J. Immunol. 144,3078(1990); A. Ashkenzi et al . , Proc. Natl. Acad. Aci . U.S.A. 87,7150(1990); H. Choe et al . , J. Aids 5,204(1992); S. E. Ryu et al . , Nature 348,419(1990); J. Wang et al., ibid. 411(1990); H. Wu, P. Kwong, W. Hendrickson, ibid 387,527(1997).
22. L. Lasky et al., Cell 50,975(1987); A. Cordonnier et al., Nature 340,571(1989); A. Cordonnier et al., J. Virol. 63,4464(1989); U. Olshevsky et al., ibid. 64,5701(1990).
23. M. Kowalski et al . , Science 237,1351(1987) .
24. R.I. Connor, K. Sheridan, D. Ceradini, S. Choe, N. Landau, J. Exp. Med. 185,621(1997); L. Zhang, Y.
Huang, T. He, Y. Cao, D. D. Ho, Nature 383,768(1996); A. Bjorndal et al . , J. Virol, 71, 7478(1997); M. Dean et al . , Science 273,1856(1996); R. Liu et al., Cell 86,367(1996); W.A. Paxton et al., Nat. Med. 2,412(1996); M. Samsonet al . , Nature
382, 722 (1996) .
25. F. Cocchi et al . , Nature Med. 2,1244(1996); P.D. Bieniasz et al . , EMBO J. 16,2599(1997); R. Speck et al., J. Virol. 71, (1997) .
26. L. Marcon et al . , J. Virol. 71,2522(1997); Z. Chen, P. Zhou, D. Ho, N. Landau, P. Marx, ibid. 71,2705(1997) ; F. Kichhoff et al . , ibid..71,6509 (1997) ; J. Rucker et al . , ibid.
8999(1997); N/ Sol et al . , ibid. 71,82837(1997) .
27. C. Rizzuto et al . , in preparation.
28. CM. Carr and P.S. Kim, Cell 73,823(1993); P. Bullough, F. Hughson, J/ Skehel, D.C. Wiley, Nature 371,37(1994); W. Weissenhom et al . , EMBO J. 15 , 1507 ( 1996 ) .
29. E. Freed, D. Myers, R. Risser, Proc. Natl. Acad. Sci. U.S.A. 87,4650(1990); J. Cao et al., J. Virol. 67,2747(1993); J.M. Felser, T. Klimkait, J. Silver, Virology 170,566(1989); H. Schaal, M. Klein, P. Gehrmann, O. Adams, A. Scheid, J. Virol. 69,3308(1995); M. Delahunty, I. Rhee, E. Freed, J. Bonifacino, Virology 218,94(1996); J.W. Dubay, S. Roberts, B. Brody, E. Hunter, J. Virol. 66,4748(1992).
30. C.-H. Chen, T.J. Matthews, C.B. McDanal, D.P. Bolgnesi, M.L. Greenberg, J. Virol. 69,3771(1995); C. Wild., T. Oas, C McDanal, D. Bolgnesi, T.
Mattews, Proc. Natl. Acad. Sci. U.S.A.
89,10537(1992); S. Jiang, K. Lin, N. Strick, A.R.
Neurath, Nature 365,113(1993); S. Jiang, k. Lin, N.
Strick, A.R. Neurath, BBRC 195,533(1993).
31. R. Wyatt et al., J. Virol. 71,9722(1997).
32. J. Moore and J. Sodroski, ibid. 70,1863(1996).
33. R. Wyatt et al . , ibid. 69,5723(1995); J. Cao et al . , ibid 71, 9722 (1997) .
34. E. Daar, X. L. Li, T. Moudgil, D.D. Ho, Proc. Natl.
Acad. Sci. U.S.A. 87,6574(1990); P.J. Gomatos et al., J. Immunol. 144,4183(1990); T. Wrin et al . , J.
Virol. 69,39(1995); J.R. Mascola et al . , J. Inf.
Dis. 169,48(1994); L. Sawyer et al . , J. Virol.
68,1342(1994); N. Sullivan, Y. Sun, J. Li, W.
Hofmann, J. Sodroski, J. Virol. 69,4413(1995); J. P. Moore and D.D. Ho, AIDS 9,SH7 (1995); W. O'Brien,
S. Mao, Y. Cao, J. Moore, J. Virol. 68,5264(1994);
Y. Xhang, R. Fredriksson, J. MckEating, E. M. Fenyo, Virology 238,254(1997); T. Mattewsm AIDS Res Human Retrovisruses 10,631(1994) .
35. T. Morikita et al . , AIDS Res. Human Retrovirsues 13,1291(1997); A. Koito, G. Harrow, J. Levy, C
Cheng-Meyer, J. Virol. 68,2253(1994); S. Hwang, T.
Boyle, H.K. Lyerly, B. Cullen, Science
257,535(1992); N. Sullivan et al . , submitted.
36. T. Fouts, J. Binley, A. Trkola, J. Robinson, J. P. Moore, J.
37. J. Wang, S. Steel, R. Wisniewolski, CY. Yang, Proc. Natl. Acad. Sci., U.S.A. 83,6159(1986); J. W. Gnannm Jr. J. Nelson, M.A. B. Oldstone, J. Virol.
61,2639(1987); S. Karwowska et al . , AIDS Res. Human Retroviruse 8,1099(1992); J. Krowka et al . , Clin. Immunol. Immunopathol. 59. 53(1991); T.J. Palker et al ., Proc. Natl., Acad. Sci. U.S.A. 84,2479(1987); J. M. Bunley et al . , AIDS Res. Hum. Retroviruses
12,911 (1996) .
38. P. Nara et al . , J. Virol. 61, 3173(1987); P. Nara et al., ibid. 64,3779(1990); A. Gegerfelt, J. Albert, L. Morfeldt-Manson, K. Broliden, E. M. Fenyo, Virology 185, 162, (1991); M. Arendrup et al . ,
JAIDS5, 303(1992); J. Li et al . , J. Virol. 69,7061(1995) .
39. J. Mckeating et al . , J. Virol. 67,4932(1993); M.S.C Fung et al . , ibid. 66,8489(1992); J. Moore et al . , ibid. 67, 6136(1993); M. Grony et al . , ibid. 68,8312(1994); C Shotton et al . , ibid. 69,222(1995); H. Ditzel et al, J. Immunol
154,893 (1995) .
40. T. Ohnoet al . , Proc. Natl. Acad. Sci. U.S.A. 88,10726(1991); J.P. Moore et al . , J. Virol.69, (1995); M. Gorny et al . , ibid. 66,7538(1992).
41. E. Helseth et al . , J. Virol. 64,2146(199); E. Freed, D. Myers, R. Risser, ibid. 65, 190(1991); L. Ovanoff et al., AIDS Res. Hum. Retroviruses 7,595(1991); K.
Page, S. Stearns, D. Littman, J. Virol. 66,524(1992); N. Sullivan, M. Thali, C Furman, D. Ho, J. Sodroski, ibid. 67,3674(1993) .
42. G. Myers et al., Human Retroviruses and AIDS 1996 (Los Alamos National Laboratory, Los Alamos, N.M.).
43. A. Profy et al., J. Immunol. 144,4641(1990); I. Berkower, G. Smith, C. Girl, D. Murphy, J. Exp. Med. 170,1681(1989); C.-Y. Kang et al., Proc. Natl. Acad. Sci. U.S.A. 88,6171(1991); K.S. Steimer, C.J. Scandella, P.V. Skiles, N.L. Haigwood, Science 254,105(1991); J.P. Moore and D.D. Ho, J. Virol. 67,863(1993).
44. M. Posner et al., J. Immunol. 146,4325(1991); S.A. Tilley, W. Honnen, M. Racho, M. Hilgartner, A. Pinter, Res. Virol. 142,247(1991); D.R. Burton et al., Science 26,1024(1994); D.D. Ho et al., J. Virol. 65,489(1991); S. Karbobska et al., AIDS Res. Human Retroviruses 8,689(1992).
45. M. Thali et al . , J. Virol. 65, 6188(1991); M. Thali et al., ibid. 66,5635(1992); J. McKeating et al . , Virology 190, 134 (1992) ;M. Shutten et al . , AIDS 7,
919 (1993) .
46. M. Thali et al . , J. Virol. 67,3978(1993); C.-Y. Kang, K. Harihan, M.R. Posner, P. Nara, J. Immunol. 151, 449 (1993) .
47. N. Sullivan et al., submitted.
48. T. Muster et al., J. Virol. 67,6642(1993); A. Trkola, ibid. 70,1100(1996).
49. J. Rusche et al . , Proc. Natl. Acad. Sci. U.S.A. 84, 6924(1987); J. Klaniecki et al . , AIDS Res. Hum.
Retroviruses 7,791(1991); N. Haigwood et al . , J. Virol. 66,172 (1992) .
Third Series of Experiments
The entry of human immunodeficiency virus (HIV) into cells requires sequential interactions of the viral exterior envelope glycoprotein, gp120, with the CD4 glycoprotein and a chemokine receptor on the cell surface. These interactions initiate a fusion of the viral and cellular membranes. Although gp120 can elicit virus-neutralizing antibodies, HIV eludes the immune system. We have solved the X-ray crystal structure at 2.5 Å resolution of an HIV-1 gp120 core complexed with a two-domain fragment of human CD4 and an antigen-binding fragment of a neutralizing antibody that blocks chemokine-receptor binding. The structure reveals a cavity-laden CD4-gp120 interface, a conserved binding site for the chemokine receptor, evidence for conformational change upon CD4 binding, the nature of a CD4-induced antibody epitope, and specific mechanisms for immune evasion. Our results provide a framework for understanding the complex biology of HIV entry into cells and will guide efforts to intervene.
Introduction
Human immunodeficiency viruses, HIV-1 and HIV-2, and the related simian immunodeficiency viruses (SIV) cause the destruction of CD4+ lymphocytes in their respective hosts, resulting in the development of acquired immunodeficiency syndrome (AIDS) (1, 2). The entry of HIV into host cells is mediated by the viral envelope glycoproteins, which are organized into oligomeric, probably trimeric, spikes displayed sparsely on the surface of the virion. These envelope complexes are anchored in the viral membrane by the gp41 transmembrane envelope glycoprotein. The surface of the spike is composed primarily of the exterior envelope glycoprotein, gp120, associated by noncovalent interactions with each subunit of the trimeric gp41 glycoprotein complex (3, 4). When the gp120 sequences of different primate immunodeficiency viruses were initially compared, five variable regions (V1-V5) were identified (5). The first four variable regions form surface-exposed loops that contain disulfide bonds at their bases (6). The conserved gp120 regions form discontinuous structures important for the interaction with the gp41 ectodomain and with the viral receptors on the target cell. Both conserved and variable gp120 regions are extensively glycosylated (6). The variability and glycosylation of the gp120 surface likely modulate the immunogenicity and antigenicity of the gp120 glycoprotein, which is the major target for neutralizing antibodies elicited during natural infection (7).
Entry of primate immunodeficiency viruses into the host cell involves the binding of the gp120 envelope glycoprotein to the CD4 glycoprotein, which serves as the primary receptor. The gp120 glycoprotein binds to the most amino-terminal of the four immunoglobulin-like domains of CD4. Structures of both the N-terminal two domains (8, 9) and the entire extracellular portion of CD4 (10) have been determined, and mutagenesis studies indicate that the CD4 structure analogous to the second complementarity-determining region (CDR2) of immunoglobulins is critical for gp120 binding (11, 12). Conserved gp120 residues important for CD4 binding have likewise been identified by mutagenesis (3, 13, 14).
CD4 binding induces conformational changes in the gp120 glycoprotein, some of which involve the exposure and/or formation of a binding site for specific chemokine receptors. These chemokine receptors, mainly CCR5 and CXCR4 for HIV, serve as obligate second receptors for virus entry (15, 16). The gp120 third variable (V3) loop is the major determinant of chemokine receptor specificity (17). However, other more conserved gp120 structures that are exposed upon engagement of CD4 also appear to be involved in chemokine-receptor binding. This CD4-induced exposure is indicated by the enhanced binding of several gp120 antibodies (18, 19) which, like V3-loop antibodies, efficiently block the binding of gp120-CD4 complexes to the chemokine receptor (20). These are called the CD4-induced (CD4i) antibodies. CD4 binding may trigger additional conformational changes in the envelope glycoproteins. For example, the binding of CD4 to the envelope glycoproteins of some HIV-1 isolates induces the release or "shedding" of the gp120 protein from the complex (21), although the relevance of this process to HIV entry is uncertain.
HIV and related retroviruses belong to a class of enveloped fusogenic viruses that includes corona-, paramyxo- and orthomyxoviruses (e.g. influenza virus), all of which require post-translational cleavage for activation. The transmembrane coat proteins of these viruses (gp41 equivalents) share sequence resemblance, particularly in their N-terminal fusion peptides, and they participate directly in membrane fusion. The ectodomain of gp41 can form a coiled coil resembling that of influenza hemagglutinin HA (23, 4, 22), supporting the notion that this class of viruses may share some common aspects with respect to virus entry. In other respects, enveloped viruses tend to be distinctive. They use varying modes of entry (direct membrane penetration for HIV, endocytosis for influenza virus) and even otherwise closely related viruses may use individualized receptors. The exterior coat proteins (gp120 equivalents) are accordingly specialized. Thus, for example, there is no detectable similarity in sequence, nor now in structure, between the receptor binding portion of HIV and that of murine leukemia virus (23), another retrovirus. Mechanisms for receptor-mediated triggering of fusion may also be virus specific.
Because of the key role that the gp120 glycoprotein plays in receptor binding and in interactions with neutralizing antibodies, knowledge of the gp120 structure is important for understanding HIV infection and for the design of therapeutic and prophylactic strategies. Here, we report the crystal structure, at 2.5 Å resolution, of an HIV-1 gp120 core bound to a two-domain fragment of the CD4 cellular receptor and to the antigen-binding fragment (Fab) of an antibody, 17b, that is directed against a CD4i epitope. A companion report relates this structure to the antigenic properties of the gp120 envelope proteins (24).
Structure determination
The extensive glycosylation and conformational heterogeneity associated with the HIV-1 gp120 glycoprotein recommended a crystallization strategy aimed at radical modification of the protein surface. We made truncations at termini and variable loops in various combinations with gp120 from various strains, extensively deglycosylated these gp120 variants, and produced complexes with various ligands. A theoretical analysis showed that the probability of crystal formation is greatly increased by such reduction of surface heterogeneity and trials with multiple variants (25). After screening almost twenty combinations of gp120 variants and ligands, we obtained crystals of a ternary complex composed of a truncated form of gp120, the N-terminal two domains (D1D2) of CD4, and an Fab from the human neutralizing monoclonal antibody 17b (18, 25).
The crystallized gp120 is from the HXBc2 strain of HIV-1. It has deletions of 52 and 19 residues from the N- and C-termini, respectively; Gly-Ala-Gly tripeptide substitutions for 67 V1/V2-loop residues and 32 V3-loop residues; and the removal of all sugar groups beyond the linkages between the two core N-acetylglucosamine residues. This deglycosylated core gp120 eliminates over 90% of the carbohydrate but retains over 80% of the non-variable-loop protein. Its capacity to interact with CD4 and relevant antibodies is preserved at or near wild-type levels (26). The crystals are of space group P222₁ (a=71.6, b=88.1, c=196.7 Å) with one ternary complex and 60% solvent in the crystallographic asymmetric unit.
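The cell dimensions and solvent content quoted above can be cross-checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: that the P222₁ cell holds four asymmetric units, and that solvent content follows the commonly used Matthews relation (solvent fraction = 1 - 1.23/Vm). The function and constant names are hypothetical.

```python
# Illustrative consistency check of the quoted crystal parameters.
# Assumptions (not from the text): 4 asymmetric units per P222(1) cell,
# and the Matthews relation solvent_fraction = 1 - 1.23 / Vm.

ASU_PER_CELL = 4  # assumed number of asymmetric units per unit cell


def cell_volume(a, b, c):
    """Volume of an orthorhombic cell (all angles 90 degrees), in cubic angstroms."""
    return a * b * c


def implied_mass(volume_asu, solvent_fraction):
    """Macromolecular mass (Da) implied by a solvent fraction via the Matthews relation."""
    vm = 1.23 / (1.0 - solvent_fraction)  # cubic angstroms per dalton
    return volume_asu / vm


v_cell = cell_volume(71.6, 88.1, 196.7)
v_asu = v_cell / ASU_PER_CELL
mass = implied_mass(v_asu, 0.60)

print(f"cell volume: {v_cell:.0f} cubic angstroms")
print(f"volume per asymmetric unit: {v_asu:.0f} cubic angstroms")
print(f"implied mass per asymmetric unit: {mass / 1000:.0f} kDa")
```

Under these assumptions the implied mass per asymmetric unit is on the order of 100 kDa, which is the expected scale for one complex of a core gp120, a two-domain CD4 fragment and an Fab.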
The ternary structure was solved by a combination of molecular replacement, isomorphous replacement, and density modification techniques. It has been refined to an R-value of 21.0% (5-2.5 Å data > 2 sigma, R-free=30.3%). The final model, composed of 7877 atoms, comprises residues 90-396 and 410-492 of gp120 (excepting loop substitutions), residues 1-181 of CD4, and residues 1-213 of the light chain and 1-229 of the heavy chain of the 17b monoclonal antibody. In addition, 11 N-acetylglucosamine and 4 fucose residues, and 602 water molecules have been placed. The overall structure of the complex of gp120 with D1D2 of CD4 and Fab 17b is as depicted in Fig. 8.
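For readers unfamiliar with the refinement statistics quoted above, the following sketch shows how a conventional crystallographic R-value is computed from observed and calculated structure-factor amplitudes. The text does not spell out the formula; the conventional definition is assumed here, and the amplitudes are toy numbers chosen for illustration, not data from this structure.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-value: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy amplitudes (hypothetical, for illustration only).
toy_obs = [100.0, 80.0, 60.0, 40.0]
toy_calc = [90.0, 85.0, 50.0, 45.0]

print(f"R = {100 * r_factor(toy_obs, toy_calc):.1f}%")
```

R-free is obtained with the same formula, applied to a subset of reflections that was set aside and not used during refinement.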
Structure of gp120
The deglycosylated core of gp120 as dissected from the ternary complex approximates a prolate ellipsoid with dimensions of 50 x 50 x 25 Å, although its overall profile is more heart-shaped than circular. Its backbone structure is shown in Figs. 9a & c in an orientation precisely perpendicular to that in Fig. 8 (Fig. 11e gives a mutually perpendicular view). This core gp120 comprises 25 beta strands, 5 alpha helices and 10 defined loop segments, all organized with the topology shown in Fig. 9b. Specific spans of structural elements are given in Fig. 9d. The structure confirms the chemically determined disulfide bridge assignments (6; Fig. 9c). The polypeptide chain of gp120 is folded into two major domains plus certain excursions that emanate from this body. The inner domain (inner with respect to the N- and C-termini) features a two-helix, two-strand bundle with a small five-stranded beta sandwich at its termini-proximal end and a projection at the distal end from which the V1/V2 stem emanates. The outer domain is a stacked double barrel that lies alongside the inner domain such that the outer barrel and inner bundle axes are approximately parallel.
The proximal barrel of the outer-domain stack is composed of a 6-stranded, mixed-directional beta sheet that is twisted to embrace helix alpha-2 as a 7th barrel stave. The distal barrel of the stack is a 7-stranded antiparallel beta barrel. The two barrels share one contiguous hydrophobic core, and the staves also continue from one barrel to the next except at the domain interface. This interruption is centered at a side between barrels where the chain enters the outer domain with loop LB insinuated as a tongue between strands beta-16 and beta-23. The extended segment just preceding LB is like an 8th stave of the distal barrel, but it is slightly out of reach for hydrogen bonding with its beta-16 and beta-9 neighbors. The chain returns to complete the inner domain after beta-24.
The proximal end of the outer domain includes variable loops V4 and V5 and loops LD and LE, which are variable in sequence as well. Loop LC is also at this end, close in space to loop LA of the inner domain, although by topology it is at the other end of this domain. The distal end does include the stem of the excised variable loop V3 and also an excursion via loop LF into a beta hairpin, beta-20-beta-21, which in turn hydrogen bonds with the V1/V2 stem emanating from the inner domain. This completes an antiparallel, 4-stranded "bridging sheet" that stands as a peculiar minidomain in contact with, but distinct from, the inner and outer domains as well as the excised V1/V2 domain. This bridging sheet also participates in the separated interactions of gp120 with both CD4 and the 17b antibody (Fig. 8 and below). One further excursion from the body of the outer domain produces strand beta-15 and helix alpha-3, which are also important in CD4 binding.
Taken as a whole, the structure of gp120 seen here is novel. Moreover, our domain-level searches have failed to reveal similarity of the inner domain to any known atomic structures, although the missing terminal segments might conceal relationships. We do, however, find a fragmentary similarity for portions of the outer domain with known structures. In particular, part of the protomer of FabA dehydrase (27) is like part of the proximal barrel, and dUTP pyrophosphatase (28) has elements in common with both barrels of the outer domain. In each case the superimposable fraction is limited. For FabA, 45 of its 171 C-alpha atoms superimpose on five segments, but the rest are topologically unrelated. For dUTPase, 41 of its 152 C-alpha atoms appropriately capture 8 of the 15 segments in the outer domain body, but there is no helix corresponding to alpha-2 and the placements of termini are not comparable. Interestingly, several viruses related to HIV encode dUTPases; however, we have not found sequence evidence to support a possible role in coat protein evolution.
This structure of core gp120 should be a prototype for the class. As shown in the structure-based alignment of representative sequences (Fig. 9d), there is substantial conservation despite the noted variability among HIV strains. Thus, even an HIV-2 sequence is 35% identical with that of the HXBc2 strain expressed in this crystallized construct, and the identity level rises to 77% and 51%, respectively, for the more closely related HIV-1 clade C and clade O representatives. The inner domain is appreciably more conserved than the outer domain with 86%, 72% and 45% identity for the respective C, O and HIV-2 comparisons. Variability correlates with the degree of solvent exposure of residues (Fig. 9d), in keeping with the conservation of hydrophobic cores. The seven disulfide bridges retained in core gp120 are absolutely conserved and mostly buried (Fig. 9c). Glycosylation sites are all surface exposed and are conserved above average (Fig. 9d). The previously identified HIV variable segments (5) are all on loops connecting elements of secondary structure, and loops LD and LE are also especially variable. Indeed, LE is more variable than V5 in light of current sequence data. These loops are also relatively mobile as reflected in high B factors or disorder, as in V4. Interestingly, variable segments in the outer domain, including the exposed face of alpha-2, appear to arise from neutral mutation rather than selective pressure since they are on non-immunogenic surfaces, presumably masked by glycosylation.
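The identity percentages quoted above come from a structure-based alignment. As a minimal sketch of how such a figure can be computed from two pre-aligned sequences, the function below counts identical residues over the columns in which neither sequence has a gap. The choice of denominator is an assumption (the text does not state the convention used), and the two short sequences are hypothetical placeholders, not gp120 sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over aligned columns where neither sequence is gapped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
reference = "ACDEFGHIK-LM"
other = "ACDQFGH-KWLM"

print(f"{percent_identity(reference, other):.0f}% identical")
```

Restricting the same calculation to the alignment columns of one domain gives per-domain figures of the kind quoted for the inner and outer domains.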
CD4-gp120 interaction
CD4 is bound into a depression formed at the interface of the outer domain with the inner domain and the bridging sheet of gp120 (Fig. 10a). This interaction buries a total of 742 Å² from CD4 and 802 Å² from gp120. The surface areas that are actually in contact are considerably smaller (Fig. 10d) because an unusual mismatch in surface topography creates large cavities that are occluded in the interface, as described below. There is, however, a general complementarity in electrostatic potential at the surfaces of contact, although the match is imprecise in this respect as well. The focus of CD4 positivity is displaced from the center of greatest negativity on gp120 (Fig. 10c). The binding site is devoid of carbohydrate (Fig. 10g). The structure of CD4 in this complex differs only locally from that in free D1D2 structures and at only a few places: residues 17-20 at the poorly ordered CDR1-like loop and residues 41, 42, 47, 49 and 60, which are at or near the contact site and have low B factors in the gp120-bound state.
Direct interatomic contacts are made between 22 CD4 residues and 26 gp120 amino-acid residues. These include 219 van der Waals contacts and 12 hydrogen bonds. Residues in contact are concentrated in the span from 25 to 64 of CD4, but they are distributed over six segments of gp120 (Figs. 9d & 10i): 1 residue from the V1/V2 stem, loop LD, the beta-15-alpha-3 excursion, the beta-20-beta-21 hairpin, strand beta-23 and the beta-24-alpha-5 connection. These interactions are compatible with previous analyses of mutational data on both CD4 (11, 12, 29) and gp120 (3, 13, 14). Other groups are also involved, including some at gp120 sites that have not been tested, but residues identified as critical for binding do indeed interact with one another (Fig. 10e). Most importantly, Phe 43 and Arg 59 of CD4 make multiple contacts centered on residues Asp 368, Glu 370 and Trp 427 of gp120, which are all conserved among primate immunodeficiency viruses. In fact, 63% of all interatomic contacts come from one span (40-48) in CC" of CD4, and Phe 43 alone accounts for 23% of the total. Similarly, with respect to gp120, the spans of 365-371 and 425-430 contribute 57% of the total. Of the three CD4 lysine residues implicated in binding (residues 29, 35 & 46), only Lys 29 makes a direct ionic hydrogen bond, and while Asp 457 of gp120 is near to these electropositive groups (Figs. 10e & i) it does not make hydrogen bonds.
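Contact tallies such as those above are typically produced by counting atom pairs that lie within a distance cutoff and grouping the counts by residue. The sketch below illustrates that bookkeeping only. The 4.0 Å cutoff, the function name and all coordinates are hypothetical assumptions made for illustration; the text does not state the criterion that was actually used, and the numbers printed here are not results from the structure.

```python
import math
from collections import Counter


def count_contacts(atoms_a, atoms_b, cutoff):
    """Count atom pairs within `cutoff` and tally them by residue of molecule A.

    Each atom is a (residue_label, (x, y, z)) tuple; distances are in angstroms.
    """
    per_residue = Counter()
    for res_a, xyz_a in atoms_a:
        for _res_b, xyz_b in atoms_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                per_residue[res_a] += 1
    return per_residue


# Hypothetical coordinates, for illustration only.
cd4_atoms = [
    ("Phe43", (0.0, 0.0, 0.0)),
    ("Phe43", (1.4, 0.0, 0.0)),
    ("Arg59", (10.0, 0.0, 0.0)),
]
gp120_atoms = [
    ("Asp368", (0.0, 3.5, 0.0)),
    ("Glu370", (1.4, 3.8, 0.0)),
    ("Val430", (10.0, 3.9, 0.0)),
]

contacts = count_contacts(cd4_atoms, gp120_atoms, cutoff=4.0)  # assumed cutoff
total = sum(contacts.values())
for residue, n in contacts.items():
    print(f"{residue}: {n} contacts ({100.0 * n / total:.0f}% of total)")
```

Per-residue shares like the 23% attributed to Phe 43 follow from dividing each residue's tally by the total, as in the final loop.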
Several gp120 residues that are covered by CD4 are variable in sequence. This variation is accommodated in part by the large interfacial cavity (Fig. 10e). The gp120 residues in contact with this water-filled cavity are especially variable (Fig. 10g). Moreover, half of the gp120 residues that make contacts with CD4 do so only through main-chain atoms (including C-beta) of gp120, and 60% of CD4 contacts are made by gp120 main-chain atoms (Fig. 10f). Included among these are 5 of the 12 hydrogen bonds in the interface. One such contributing element is an antiparallel beta-sheet alignment of CD4 strand C" with gp120 strand beta-15 (Figs. 10a & i).
Atomic details of the interaction are particularly intricate and unusual for the contacts made between gp120 and the mutationally critical CD4 residues Phe 43 and Arg 59 (Fig. 10j). Arg 59 interacts with Asp 368 and Val 430. The carboxylate group of Asp 368 makes double hydrogen bonds with the guanidinium N-eta atoms of Arg 59, but it also hydrogen bonds back to the backbone NH group of residue 44 and it appears to be optimally positioned to receive a CH...O hydrogen bond (3.20 Å) from the Phe 43 ring. Phe 43 interacts with residues Glu 370, Ile 371, Asn 425, Met 426, Trp 427 and Gly 473 as well as Asp 368, but only the contacts with Ile 371 have a conventional hydrophobic character. Those to 425-427 and 473, including Trp 427, are only to backbone atoms. A surprisingly large fraction of the Phe 43 contacts (28%) are to polar groups. The phenyl group is stacked on the carboxylate group of Glu 370, and there are contacts with the carbonyl oxygen atoms of residues 425, 426 and 473 and the NH group of Trp 427. Indeed, at a distance of 3.10 Å, the phenyl contact with O 425 is a second candidate CH...O hydrogen bond. Asp 368 and Glu 370 have their carboxylate groups close together (3.54 Å) and they are, of course, buried in the complex. Even for gp120 excised from the complex, their fractional surface accessibilities are only 44% and 14%, respectively. Glu 370 may therefore be protonated. Perhaps the most extraordinary aspect of this site is the large cavity beyond C-zeta of Phe 43 (Figs. 10b & h, and below).
Interfacial cavities
Analysis of the solvent accessible surface of the ternary complex reveals a number of topologically interior surfaces or cavities. Two of these, both at the gp120-CD4 interface, are unusually large. The larger (279 Å³) is formed at the interface between the slightly concave middle of the CC'C" portion of the CD4 sheet, and a groove on gp120 where beta-23 and beta-24 are indented relative to beta-15 and the LD loop (Fig. 10e). The second is from a pocket in the gp120 surface that is plugged by Phe 43 from CD4 (152 Å³). This pocket is itself at the interface between the inner and outer domains of gp120 (Fig. 10h). Several other smaller cavities are also wedged at the interface between the two gp120 domains.
The larger cavity is lined by mostly hydrophilic residues, half derived from gp120 and half from CD4. It is not deeply buried; while formally a cavity in the crystal structure, minor changes in sidechain orientation would make it solvent accessible. The observed electron density and predicted hydrogen bonding are consistent with at least 8 water molecules in the cavity. Residues from gp120 that actually line the cavity (including Ala 281, Ser 364, Ser 365, Thr 455, Arg 469) exhibit sequence variability, whereas surrounding this variable patch are conserved residues, the substitution of which affects CD4 binding. These include the critical contact residues Asp 368, Glu 370 and Trp 427, which flank one end of the cavity, and Asp 457 at the other end (Fig. 10e). Similarly, CD4 residues that line the cavity (e.g., Gln 40 and Lys 35) can be mutated with only moderate effect on gp120 binding, whereas Arg 59 suffers less loss of solvent accessible surface upon gp120 binding but is highly sensitive to mutation. This cavity thus serves as a water buffer between gp120 and CD4 (Fig. 10e). The tolerance for variation in the gp120 surface associated with this cavity produces a variational island (Fig. 10g), or "anti-hot spot", which is centrally located between regions required for CD4 binding, and may help the virus escape from antibodies directed against the CD4 binding site.
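As a rough plausibility check on the water count given above, the cavity volumes can be divided by the volume a water molecule occupies in bulk liquid. The sketch below assumes roughly 30 Å³ per water molecule, a textbook bulk value that does not come from the text, and it ignores packing and hydrogen-bonding geometry, so the result is an upper-bound estimate rather than a prediction.

```python
# Assumed bulk volume per water molecule, in cubic angstroms (not from the text).
WATER_VOLUME = 30.0


def max_waters(cavity_volume, volume_per_water=WATER_VOLUME):
    """Upper-bound number of water molecules that fit in a cavity of the given volume."""
    return int(cavity_volume // volume_per_water)


larger_cavity = max_waters(279.0)  # volume quoted for the larger interfacial cavity
phe43_cavity = max_waters(152.0)   # volume quoted for the Phe 43 cavity

print(f"larger cavity: up to {larger_cavity} waters")
print(f"Phe 43 cavity: up to {phe43_cavity} waters")
```

Under this assumption the larger cavity could hold about nine water molecules, which is in line with the statement that the density is consistent with at least 8.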
The "Phe 43" cavity (Fig. 10b & h) is very different in character from the larger binding-interface cavity. It is roughly spherical, with a diameter of ~8 Å (atom center to atom center) across the center of the cavity. It is positioned just beyond Phe 43 of CD4, at the intersection of the inner domain, the outer domain and the bridging sheet. It is relatively deeply buried, extending into the hydrophobic interior of gp120. The phenyl ring of Phe 43 is the only non-gp120 residue contacting this cavity, forming a lid that covers the bottom of the cavity (Fig. 10b). Other routes of solvent access are possible: past Met 426 under the bridging sheet, or directly through the heart of gp120, at the inner domain-barrel domain interface. Ordered water molecules demarking possible paths of solvent access are found along both routes. Nonetheless, in the cavity itself, only a few water molecules are observed. The center of the cavity is dominated by a large piece of spherical density, which is over 4 Å from any protein atom (Fig. 10b). The size, shape and predicted hydrogen bonding of this density are inconsistent with those expected for water, isopropanol, ethylene glycol, or any of the other major crystallization components. We have been unable to identify the source of this density.
Residues that line the Phe 43 cavity (side chains of Trp 112, Val 255, Thr 257, Glu 370, Phe 382, Tyr 384, Trp 427 and Met 475; main chains of 255-257 and 375-377) are primarily hydrophobic. They are also highly conserved, as much so as the buried gp120 hydrophobic core. Despite a lack of steric hindrance, almost no substitutions to larger residues are found. Given the frequency of gp120 sequence divergence, such conservation strongly implies functional significance. Indeed, although residues that line this cavity provide little direct contact to CD4, they do nevertheless affect the gp120-CD4 interaction. Thus, mutations at Thr 257 (no contacts) and Trp 427 (only main-chain contacts) can substantially reduce binding. Changes in cavity-lining residues also affect the binding of antibodies directed against the CD4 binding site. In addition, many of the residues that line the cavity interact with elements of the chemokine receptor binding region (see below). It may be that the Phe 43 cavity and the other interdomain cavities form as a consequence of a CD4-induced conformational change (see below).
Despite this unusual cavity-laden interface between gp120 and CD4, we believe that this structure reflects the true character of the interaction. Core gp120 binds CD4 with essentially the same affinity (26), and residues identified as critical by mutational analysis on both components are indeed at the focus of contact in the structure. In any case, the missing loops and termini could not conceivably have a role in filling these cavities.
Antibody interface
The 17b antibody is a broadly neutralizing human monoclonal isolated from the blood of an HIV-infected individual. It binds to a CD4-induced (CD4i) gp120 epitope that overlaps the chemokine receptor-binding site (20).
Relative to other antibody-antigen pairs (Fig. 11a-c), the interface between Fab 17b and core gp120 in the ternary complex involves a small area of interaction. The solvent accessible area excluded upon binding is only 455 Å² from gp120 and 445 Å² from 17b, which is largely from the heavy chain (371 Å²). The long (15 residue) complementarity-determining region 3 (CDR3) of the heavy chain dominates, but the heavy-chain CDR2 and the light-chain CDR3 also contribute. Overall, the 17b contact surface is very acidic (3 Asp, 3 Glu, no Arg or Lys), although hydrophobic contacts (notably a cis proline and tryptophan from the light chain) predominate at the center.
On gp120, the 17b epitope lies across the base of the four-stranded bridging sheet (Fig. 11c & e). All four strands make substantial contact with 17b, suggesting that the integrity of the bridging sheet is necessary for 17b binding. The gp120 surface that contacts 17b consists of a hydrophobic center surrounded by a highly basic periphery (3 Lys, 1 Arg, and no Asp or Glu) (Fig. 11d). Although this basic gp120 surface complements the acidic 17b surface, only one salt bridge is observed (between Arg 419 of gp120 and Glu 106 of the 17b heavy chain). The rest of the specific contacts occur between hydrophobic and polar residues. Thus, the interaction between 17b and gp120 involves a hydrophobic central region flanked on the periphery by charged regions, predominantly acidic on 17b and basic on gp120. There are no direct CD4-17b contacts, and none of the gp120 residues contacts both 17b and CD4. Rather, CD4 binds on the opposite face of the bridging sheet, providing specific contacts that appear to stabilize its conformation (Fig. 10i and 10j) and may explain in part the CD4 induction of 17b binding.
The 17b epitope is well conserved among HIV-1 isolates. Of the 18 residues that show loss in solvent accessible surface upon contact with 17b, 12 residues (67%) are conserved among all HIV-1 viruses. By contrast, only 19 of the 37 gp120 residues (51%) that show loss of solvent accessible surface upon CD4 binding are similarly conserved. CD4i epitopes tend to be masked from immune surveillance by the adjacent V2 and V3 loops (see accompanying paper). Indeed, in the complex structure, a large gap is seen between gp120 and the tips of the light-chain CDR1 and CDR2 loops. Pointing directly at this gap is the base of the V3 loop. In intact gp120, the variable loops may need to be bypassed for access to the conserved structures in the bridging sheet. The 17b epitope may be further protected from the immune system by a CD4-induced conformational change (see below).
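The conservation comparison above is simple arithmetic and can be checked with a short sketch. The residue counts (12 of 18 and 19 of 37) are taken from the text; the function name `percent_conserved` is illustrative only.

```python
def percent_conserved(conserved: int, total: int) -> float:
    """Return the percentage of contact residues that are conserved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * conserved / total


# Counts taken from the paragraph above.
epitope_17b = percent_conserved(12, 18)  # residues buried on contact with 17b
cd4_site = percent_conserved(19, 37)     # residues buried on CD4 binding

print(f"17b epitope: {epitope_17b:.0f}% conserved")
print(f"CD4 binding site: {cd4_site:.0f}% conserved")
```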
Chemokine receptor site
The site of interaction with the chemokine receptor CCR5 overlaps with the 17b epitope (30). Both are induced upon CD4 binding and both involve highly conserved residues. By mutational analysis, the basic and polar gp120 residues (Lys 121, Arg 419, Lys 421, Gln 422) that contact the 17b heavy chain also are important for CCR5 interaction (30). The hydrophobic and acidic surface of the 17b heavy chain may mimic the tyrosine-rich, acidic N-terminal region of CCR5, which is important for gp120 binding and HIV-1 entry (31, 32). Geometrically, this site is directed at the cellular membrane when gp120 is engaged by CD4. Electrostatic interactions between the basic surface of the bridging sheet and the acidic chemokine receptor (and possibly the acidic headgroups in the target membrane) could drive conformational changes related to virus entry.
Oligomer and gp41 interactions
Although monomeric in isolation, gp120 likely exists as a trimeric complex with gp41 on the virion surface. The large electroneutral surface on the inner domain (Fig. 10c) is the probable site of trimer packing, based on its lack of glycosylation, its conservation in sequence, the location of the CD4 and CCR5 binding sites, and the immune response to this region. These points are elaborated in the accompanying paper, and a model is presented (24).
A large body of mutagenic and antibody-binding analyses suggests that the N- and C-termini of full-length gp120 are the most important regions for interaction with the gp41 glycoprotein (33, 34). From these analyses, we expect that gp41 interactive regions will extend away from core gp120 toward the viral membrane, and that the conserved, electroneutral surface is occluded in the oligomer/gp41 interface. A similar arrangement is seen in influenza hemagglutinin, where the extended N- and C-termini of HA1 interact with the HA2 transmembrane protein (35).
Conformational change in core gp120
There is abundant evidence to suggest that CD4 binding induces a conformational change in gp120. Much of this evidence, however, derives from intact gp120 with variable loops in place or from the oligomeric gp120:gp41 complex. The ternary complex structure provides clues to conformational changes within core gp120 itself. (Although 17b binding could contribute to the gp120 conformation observed in the crystal, the CD4 contacts are much more extensive and multifaceted than those of 17b. These observations argue that CD4 binding plays the major role in the formation of the observed gp120 structure.)
Were the conformation of gp120 seen here preserved in the absence of CD4, the Phe 43 cavity (now a pocket) would present a perplexing structural dilemma. As discussed above, the cavity-lining residues have few structural restrictions, with ample room for larger substitutions into the cavity, yet these residues are highly conserved and inexplicably hydrophobic if exposed in a pocket. This pocket structure is in turn intimately connected to the bridging sheet, itself peculiar in the absence of CD4. Thus, for example, the backbone amide of bridging-sheet residue 425 is hydrogen-bonded to Glu 370, a critical CD4 contact residue (Fig. 10j); Ile 424 makes extensive hydrophobic contacts with Phe 382, which lines the pocket from the outer domain; and Trp 427 packs perpendicular to Trp 112, which lines the pocket from the inner domain (Fig. 10b). Nε1 of Trp 427 is delicately poised for hydrogen-bonding with the π-electrons of the indole ring of Trp 112. Structures such as these would necessarily be very sensitive to orientational shifts between the inner and outer domains.
The characteristics of 17b binding to core gp120 provide additional evidence for a CD4-induced conformational change. We do not observe detectable binding of Fab 17b to core gp120 unless CD4 is present, but then the ternary complex is stable in gel filtration. Since there are no direct CD4-17b contacts in the structure, the effect of CD4 must be to stabilize the bridging-sheet minidomain to which 17b binds. This result is compatible with the binding properties of 17b and other CD4i antibodies to full-length gp120 (18) (see accompanying paper), but it shows that the conformational change is not limited to an unmasking of the antibody epitope by CD4-induced movement of the V2 loop, as initially thought (36). The ability of the 17b antibody to bind full-length gp120 in the absence of CD4, albeit at a lower level, implies that structural elements required for 17b binding can be accessed in the absence of CD4. If we assume that 17b binds in the same way to both full-length and core gp120, as shown by the concordance between the structural contacts (Fig. 11) and epitope mapping data, this suggests that alternative conformations are in a kinetically accessible equilibrium in native gp120.
A further indication that core gp120 may differ in the absence of CD4 comes from comparison with theory. When applied to the many known sequence variants of gp120, the evolutionary algorithm of PHD (37) gives secondary-structure predictions with 90% estimated reliability for roughly 45% of the core gp120 sequence. Compared to our structure, it is accurate except at three places where it is markedly wrong (four consecutive residues with reliability index greater than 90%). All of these are at the Phe 43 cavity or in contacts with CD4: loop LB, strand β15, and the segment of β20 into the turn to β21 (Fig. 10h). Most significantly, the latter segment (residues 422-429) entering the bridging sheet is predicted to be helical. Indeed, residues 427-428 at the β20-β21 turn do have helical character. We also note that CD4 binds efficiently to a gp120 derivative with both β2 and β3 truncated (38). Since the bridging sheet is most likely not stable in the absence of half its strands, CD4 binding must possess the ability to properly orient strands β20 and β21 from a very different prior conformation. The Phe 43 cavity is at the nexus of the CD4 interface, between the inner domain, the outer domain, and the bridging sheet. As such, Phe 43 itself seems to serve as a keystone without which the structure might collapse. If so, to what state and, in reverse, how does CD4 binding lead to the state seen in this ternary complex? Certainly, it is clear that CD4-gp120 binding kinetics are complex (39), and microcalorimetric analysis reveals unusually large ΔH and compensating TΔS values for soluble CD4 binding to gp120 (M. L. Doyle, personal communication). These exceptional CD4-binding thermodynamics imply a large conformational change and are similar for both full-length and core gp120, which further supports the relevance of the structural observations on core gp120. We imagine that CD4 sees gp120 as an uneven equilibrium of conformational states, makes initial contact through electrostatic interactions (Fig. 10c), stabilizes a nascent complex state, and inserts the Phe 43 to induce formation of the Phe 43 cavity.
Viral evasion of immune surveillance
Analysis of the antigenic structure of gp120 shows that most of the envelope protein surface is hidden from humoral immune responses by glycosylation and oligomeric occlusion (accompanying paper). Most broadly neutralizing antibodies access only two surfaces, one that overlaps the CD4 binding site (shielded by the V1/V2 loop) and the other that overlaps the chemokine receptor binding site (shielded by the V3 loop). Conformational changes in core gp120 provide additional mechanisms for evasion from immune surveillance. In the case of the CD4-binding surface, which contains a high proportion of mainchain atoms in the complex (Fig. 10f), the conformation without CD4 bound may expose underlying sidechain variability (Fig. 10g). Escape may also be provided by the recessed nature of the binding pocket (steric occlusion) (Fig. 10a) and by a topographical surface mismatch, which encloses a variational island or "anti-hot spot" (described above, Fig. 10d). Similar mechanisms may be found in the chemokine receptor region: conformational change may hide the conserved epitope (unformed prior to CD4 binding); steric occlusion may take place between the CD4-anchored viral spike and the proximal target membrane; and an "anti-hot spot" equivalent may camouflage chemokine-receptor binding residues on the V3 loop in surrounding variability. Some of the defenses used to elude antibody-based responses may also help HIV avoid cellular immunity. Understanding the specific gp120 mechanisms of immune evasion may be a prerequisite to the design of effective prophylaxis.
Mechanistic implications for virus entry
During virus entry, the HIV surface proteins function to fuse the viral membrane with the target cell membrane. The gp120 glycoprotein plays roles crucial to the control and initiation of fusion. One set of roles concerns positioning: locating a cell capable of productive viral infection, anchoring the virus to the cell surface, and orienting the viral spike next to the target membrane. Another set concerns timing: holding the gp41 in a metastable conformation and triggering the coordinate release of the three N-terminal fusion peptides of the trimeric gp41. While it is clear that this is a complex multi-conformational process, the simplicity of the system, composed only of two membranes, the viral oligomer, and two host receptors, raises the possibility that we may be able to understand the entire mechanism. Crystallography has now provided two snapshots: an intermediate state in which gp120 is bound to CD4, described herein; and a probably final, "fusion-active" state of the gp41 ectodomain (40, 41). Although precise structural information is lacking for other intermediates, the vast biochemical data concerning the membrane fusion process mediated by the HIV-1 envelope glycoproteins allow us to extend our understanding from these two states.
The entry process is initiated by the binding of HIV-1 to the cellular receptor CD4 (Fig. 12, step 1). Although the extracellular portion of CD4 has some segmental flexibility, this binding roughly orients the viral spike. This orientation can be simulated by an alignment of the D1D2 CD4 in the ternary complex with the previously solved structure of the four-domain, entire extracellular portion of CD4 (10). Such alignment orients the N- and C-termini of core gp120 towards the viral membrane, while the 17b epitope/chemokine receptor-binding site on the gp120 surface faces the target cell membrane. Such an orientation is consistent with the proposed oligomeric structure and gp41-interactive surfaces described above.
CD4 binding also induces conformational changes in gp120, which result in the creation of a metastable oligomer. Although some of the more flexible gp120 regions and gp41 are missing, the structure of the core gp120-CD4 complex presented here describes this state in atomic detail. CD4 binding results in movement of the V2 loop, which numerous experiments suggest partially occludes the V3 loop and CD4i epitopes (18, 36). It also creates, or at least stabilizes, the bridging sheet on which these epitopes are located (described above for the core). In addition, CD4 binding results in changes in the conformation of the V3 region, with the tip of the loop becoming more accessible, as judged by enhanced proteolytic susceptibility and altered exposure of V3 epitopes (19). The V3 loop together with the uncovered epitopes comprise the chemokine-receptor binding site. Thus, CD4 binding not only orients the gp120 surface implicated in chemokine receptor binding to face the target cell, but it also forms and exposes the site itself. We note that these changes may all result from a single, concerted shift in the relative orientation of the inner and outer domains. This conformational shift may alter the orientation of the N- and C-termini, at the proximal end of the inner domain, perhaps partially destabilizing the oligomeric gp120/gp41 interface (21). Such a shift would also alter the relative placement of the V1/V2 stem (in the CD4i site), which emanates from the inner domain, and the V3 loop, which emanates from the outer domain. Interestingly, mutations that permit an adaptation of HIV-1 to CD4-independent entry using CXCR4 involve sequence changes in both the V1/V2 stem and the V3 loop (42).
The next step in HIV-1 entry is the interaction of the gp120-CD4 complex with the chemokine receptor (Fig. 12, step 2). Although interactions between CD4 and chemokine receptor may occur, mutagenic analyses (H. Choe and J. Sodroski, unpublished observations) and the known examples of CD4-independent virus entry or chemokine-receptor binding suggest that direct gp120 contacts dominate in the interaction with the chemokine receptor. Since most of the chemokine receptor is encased in the host membrane, binding would necessarily move the gp120 bridging sheet close to the target membrane. This movement requires CD4 flexibility, since the initial HIV binding at the N-terminal D1 domain probably occurs above the glycocalyx. Reducing flexibility at the D2-D3 juncture or at the D4-membrane juncture of CD4 has been shown to block HIV-1 entry (10, 43). Chemokine-receptor binding is believed to trigger additional conformational changes in the HIV-1 envelope glycoprotein trimer which lead to exposure of the gp41 ectodomain. Presumably, a signal is transmitted from the cell-associated distal end of gp120 to elements of the inner domain that are likely to be involved in gp120-gp41 or gp120-gp120 association on the trimer. Although further inter-domain shifts may occur in core gp120 after chemokine-receptor binding, the geometrically specific contacts that support the bridging sheet make it unlikely that another shift could occur without destabilizing this important component of the chemokine-receptor binding site. Since the high affinity of interaction makes it likely that both CD4 and chemokine receptor remain bound to gp120 during fusion, we expect that additional conformational changes probably occur between neighboring gp120 protomers in the oligomeric complex. Perhaps the chemokine receptor triggers gp41 exposure by prising gp120 protomers away from the trimer axis, thus exerting a torque on the gp120-gp41 interface. In this regard it is interesting that several of the substitutions that affect chemokine-receptor binding in the context of monomeric gp120 appear to induce gp120 dissociation in an oligomeric context (30).
The structure of the gp120/CD4/17b antibody ternary complex described here reveals some of the molecular aspects of HIV-1 entry, including the atomic structure of gp120, the explicit interactions with CD4, and the conserved site of binding for the chemokine receptor. Still unknown are details of the apo state of core gp120, the oligomeric structure, the interaction with the chemokine receptor, the conformational changes that trigger the reorganization of the gp41 ectodomain, and the structural basis for insertion of the fusion peptide of gp41 into the target membrane. Further understanding will require snapshots of other intermediates. The conformational complexity and observed intricate domain associations of gp120, like those of reverse transcriptase (44), the other large HIV translation product, may reflect genome restrictions at the protein level akin to those that lead to overlapping reading frames at the transcription level. Multiply protected infection machinery is contained in these condensed intricacies. Its mechanisms frustrate host defenses; understanding them may inspire medical intervention.
Methods
Protein production, crystallization, and data collection. The two-domain CD4 (D1D2, residues 1-182) was produced in Chinese hamster ovary cells (8), the monoclonal antibody 17b in an Epstein-Barr virus immortalized B-cell clone isolated from an HIV-1 infected individual and fused with a murine B-cell fusion partner (18), and the core gp120 from Drosophila Schneider 2 lines under control of an inducible metallothionein promoter (20). The various biochemical manipulations (e.g. deglycosylation for the gp120 and papain digestion to produce the Fab 17b), protein purification, and ternary complex crystallization are described elsewhere (25). The best crystals were small needles of cross-section only 30-40 μm. These were crosslinked with vapor diffusion glutaraldehyde treatment (C. J. Lusty, personal communication), equilibrated with cryoprotectant containing stabilizer (10% ethylene glycol with 10.5% monomethyl-PEG 5,000, 10% isopropanol, 50 mM NaCl, 100 mM Citrate/HEPES buffer pH 6.3), transferred into immiscible oil (Paratone-N; Exxon), suspended in a small ethylene loop at the end of a mounting pin, and flash-frozen in a cryostat nitrogen stream at 100 K.
Diffraction data were collected at beamline X4A, Brookhaven National Laboratory, using phosphor image plates and a Fuji BAS2000 scanner. To avoid overlap problems from the relatively high mosaicity (~1.0°), oscillation data were collected using a rotation axis that was offset at least 30° from the 197 Å c axis. Although crystals initially diffracted to Bragg spacing of greater than 2 Å, β axis mosaicity and substantial radiation damage despite cryogenic cooling reduced the overall resolution to 2.5 Å. Data processing and reduction were performed using DENZO and SCALEPACK (45) (Table 1).
Structure determination and refinement. To locate the position of the Fab 17b in the ternary complex crystals, rotational searches with 52 different Fab models were made with the program MERLOT (P. M. Fitzgerald). The Fabs were aligned by superposition of their variable domains to allow comparison of rotational solutions. Even though four models showed greater than 10% discrimination between highest and second highest solutions, no consistent rotational solution was found. Discrimination between correct and incorrect solutions was achieved by using confirmatory searches with the variable portion of the Fab. This was successful with only one model, molecule B of 1hil. Rigid body refinement of the 1hil solution (XPLOR (46)), allowing each immunoglobulin domain to move independently, produced a Patterson correlation of 24.9%. To locate the position of the two-domain CD4, each of the top 100 possible rotational solutions with each of three different CD4 models (1cdi, 1cdh, 3cd4) was searched for a distinctive translation solution (AMoRe; J. Navaza). The translation searches used the rigid body refined Fab as a partial structure to help discriminate the correct solution. Two distinctive solutions were found: the 25th rotational solution of 3cd4 gave a translation correlation of 0.171 (versus 0.128 for the second highest translation solution), and the 61st rotational solution of 1cdh gave 0.149 (versus 0.140). These two solutions were virtually identical. Rigid body refinement in XPLOR (46) gave a Patterson correlation of 7.9% for the CD4 alone and 32.4% for the Fab and CD4. All molecular replacement and rigid body refinements used 8-4 Å data.
To provide additional phasing, crystals were soaked in over 20 different heavy atom solutions and screened for isomorphous replacement using the statistical χ² test in SCALEPACK (45). Derivatives were identified from two heavy atom compounds: 10 mM K3IrCl6 (10 hr equilibration in heavy atom containing cryoprotectant stabilizer; 2.8 Å) and 5 mM K2OsCl6 (24 hr soak; 3.5 Å). Isomorphism was found to be highest between these heavy atom data sets and a native data set collected at pH 7.0 (cryoprotectant stabilizer buffered with 50 mM BisTris pH 7.0). Heavy atom sites were identified by difference Fourier analysis using the molecular replacement phases, and phasing parameters were refined with MLPHARE (in the CCP4 suite of crystallographic programs). The K3IrCl6 derivative was modeled as 9 partially occupied sites: two sites of occupancy 0.158 and 0.142, and 7 of less than 0.07. While relatively isomorphous, poor data quality (Rsym of greater than 20% past 3.0 Å) combined with relatively small isomorphous differences (Riso of 12.0%) reduced the quality of phasing. In contrast, the K2OsCl6 derivative had an Riso of 15.6%, but was only isomorphous to roughly 5 Å. It was modeled as 4 sites of occupancy 0.321, 0.207, 0.194 and 0.128, with the highest site at the same position as the second highest site from K3IrCl6.
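For readers unfamiliar with the isomorphism statistic quoted above, the following sketch computes a fractional isomorphous difference in a commonly used form, Riso = Σ|F_PH − F_P| / Σ F_P, over matched reflections. This definition is an assumption, not necessarily the exact expression evaluated by SCALEPACK or MLPHARE, and the amplitudes in the usage lines are invented purely for illustration.

```python
from typing import Sequence


def r_iso(f_native: Sequence[float], f_derivative: Sequence[float]) -> float:
    """Fractional isomorphous difference over matched reflections.

    Assumed definition: sum(|F_PH - F_P|) / sum(F_P), where F_P and F_PH are
    native and derivative structure-factor amplitudes on a common scale.
    """
    if len(f_native) != len(f_derivative):
        raise ValueError("amplitude lists must have the same length")
    denominator = sum(f_native)
    if denominator <= 0:
        raise ValueError("native amplitudes must sum to a positive value")
    numerator = sum(abs(fph - fp) for fp, fph in zip(f_native, f_derivative))
    return numerator / denominator


# Hypothetical amplitudes, for illustration only (not data from this work).
native = [100.0, 80.0, 120.0, 100.0]
derivative = [110.0, 70.0, 130.0, 90.0]
print(f"R_iso = {100 * r_iso(native, derivative):.1f}%")
```

A larger value indicates larger amplitude differences between derivative and native data, which can reflect either heavy-atom signal or non-isomorphism; the statistic alone does not distinguish the two.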
The initial combination of model and isomorphous replacement phasing did not produce readily interpretable density for gp120. In order to monitor efforts at phase improvement, we devised an objective assay of density quality that used correlations in a region internal to domain 1 of CD4 between the experimental electron density and the calculated model density (CD4 as positioned by molecular replacement and rigid body refinement). Refinement of heavy atom positions improved this correlation, and provided a starting point for phase improvement, primarily using real-space modification techniques (Table 1). These techniques included automatic concatenation of the unmodeled density (with the program PRISM; D. Agard), reciprocal-space averaging of the PRISM modeled density and real-space model subtraction (implemented using the XPLOR (46) shell language), application of real-space constraints such as solvent flattening, histogram matching and negative density truncation (with the program DM in the CCP4 suite of crystallographic programs), and real-space combinatorial addition of the various experimental density maps (with the program MAPMAN; T.A. Jones). The combinatorial use of these techniques generated greatly improved electron density maps.
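The density-quality assay described above relies on a correlation between experimental and model density over a fixed region. A minimal sketch of such a score is given below as a Pearson correlation coefficient over paired grid-point values. The Pearson form, the function name `density_correlation`, and the sample values are all assumptions made for illustration; the text does not specify the exact formula that was used.

```python
import math
from typing import Sequence


def density_correlation(rho_exp: Sequence[float], rho_calc: Sequence[float]) -> float:
    """Pearson correlation between two density samples on the same grid points."""
    n = len(rho_exp)
    if n != len(rho_calc) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_exp = sum(rho_exp) / n
    mean_calc = sum(rho_calc) / n
    cov = sum((a - mean_exp) * (b - mean_calc) for a, b in zip(rho_exp, rho_calc))
    var_exp = sum((a - mean_exp) ** 2 for a in rho_exp)
    var_calc = sum((b - mean_calc) ** 2 for b in rho_calc)
    if var_exp == 0 or var_calc == 0:
        raise ValueError("density samples must not be constant")
    return cov / math.sqrt(var_exp * var_calc)


# Hypothetical density values sampled inside the reference region.
model = [0.2, 0.9, 1.4, 0.7, 0.1]
experimental = [0.3, 0.7, 1.5, 0.5, 0.2]
print(f"correlation = {density_correlation(experimental, model):.3f}")
```

Restricting the score to a region whose model is already trusted means that an increase in the correlation can be read as an improvement in the phases rather than in the model.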
At this point, most of the Cα backbone could be modeled (with the program O (47)), defining the secondary structure. Computer-aided sequence alignment (slider routine in O) and secondary structure prediction (PHD (37)) helped to position the amino acid sequence, leaving only regions around the N-terminus (residues 79-100 and residues 215-245), the V1/V2 loop, and the V4 loop uncertain. Iterative rounds of building with O, simulated annealing and positional refinement with XPLOR (46), and addition of ordered solvent clarified the trace.
Structure analysis. Deviations of the CD4 structure in the complex from the free state were measured by the procedure of Wu et al. (10). Deviations were taken as significant when the root mean square (rms) residue deviation was greater than the overall value and also more than 0.5u greater than variation among the free structures. Interatomic contacts were defined as in Zhu et al. (48). Structural alignments were made by visual comparison of the SCOP database, and automatic searches were performed with PrISM (A.-S. Yang and B. Honig).
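The two-part significance criterion for residue deviations can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Wu et al. (10): the structures are assumed to be already superimposed, the helper names `rms_deviation` and `is_significant` are hypothetical, and the threshold is passed in as a parameter `margin` rather than fixed.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rms_deviation(atoms_a: Sequence[Point], atoms_b: Sequence[Point]) -> float:
    """Root mean square distance between paired atoms of superimposed structures."""
    if len(atoms_a) != len(atoms_b) or not atoms_a:
        raise ValueError("need two non-empty, equal-length atom lists")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(atoms_a, atoms_b))
    return math.sqrt(total / len(atoms_a))


def is_significant(residue_rms: float, overall_rms: float,
                   free_variation: float, margin: float) -> bool:
    """Both conditions must hold: the residue deviation exceeds the overall
    value, and it exceeds the variation among free structures by more than
    `margin` (assumed here to be in the same units as the deviations)."""
    return residue_rms > overall_rms and (residue_rms - free_variation) > margin


# Hypothetical coordinates: residue atoms displaced by 1.0 along z.
complexed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
free = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
residue_rms = rms_deviation(complexed, free)
print(residue_rms, is_significant(residue_rms, 0.6, 0.3, 0.5))
```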
References for the Third Series of Experiments
1. Barre-Sinoussi, F., et al. Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immunodeficiency syndrome (AIDS). Science 220, 868-871 (1983).
2. Gallo, R.C., et al. Frequent detection and isolation of cytopathic retroviruses (HTLV-III) from patients with AIDS and at risk for AIDS. Science 224, 500-503 (1984).
3. Kowalski, M.L., et al. Functional regions of the envelope glycoprotein of human immunodeficiency virus type 1. Science 237, 1351-1355 (1987).
4. Lu, M., Blacklow, S. & Kim, P. A trimeric structural domain of the HIV-1 transmembrane glycoprotein. Nature Structural Biol. 2, 1075-1082 (1995).
5. Starcich, B.R., et al. Identification and characterization of conserved and variable regions of the envelope gene of HTLV-III/LAV, the retrovirus of AIDS. Cell 45, 637-648 (1986).
6. Leonard, C.K., et al. Assignment of intrachain disulfide bonds and characterization of potential glycosylation sites of the type 1 recombinant human immunodeficiency virus envelope glycoprotein (gp120) expressed in Chinese hamster ovary cells. J. Biol. Chem. 265, 10373-10382 (1990).
7. Profy, A.T., et al. Epitopes recognized by the neutralizing antibodies of an HIV-1-infected individual. J. Immunol. 144, 4641-4647 (1990).
8. Ryu, S.-E., et al. Crystal structure of an HIV-binding recombinant fragment of human CD4. Nature 348, 419-426 (1990).
9. Wang, J.H., et al. Atomic structure of a fragment of human CD4 containing two immunoglobulin-like domains. Nature 348, 411-418 (1990).
10. Wu, H., Kwong, P.D. & Hendrickson, W.A. Dimeric association and segmental variability in the structure of human CD4. Nature 387, 527-530 (1997).
11. Moebius, U., Clayton, L., Abraham, S., Harrison, S. & Reinherz, E. The human immunodeficiency virus gp120 binding site of CD4: delineation by quantitative equilibrium and kinetic binding studies of mutants in conjunction with a high-resolution CD4 atomic structure. J. Exp. Med. 176, 507-517 (1992).
12. Sweet, R.W., Truneh, A. & Hendrickson, W.A. CD4: its structure, role in immune function and AIDS pathogenesis, and potential as a pharmacological target. Curr. Opin. Biotech. 2, 622-633 (1991).
13. Olshevsky, U., et al. Identification of individual HIV-1 gp120 amino acids important for CD4 receptor binding. J. Virol. 64, 5701-5707 (1990).
14. Cordonnier, A., Montagnier, L. & Emerman, M. Single amino acid changes in HIV envelope affect viral tropism and receptor binding. Nature 340, 571-574 (1989).
15. Moore, J.P. Coreceptors: implications for HIV pathogenesis and therapy. Science 276, 51-52 (1997).
16. Feng, F., Broder, C.C., Kennedy, P.E. & Berger, E.A. HIV-1 entry co-factor: functional cDNA cloning of a seven-transmembrane, G protein-coupled receptor. Science 272, 872-877 (1996).
17. Speck, R.F., et al. Selective employment of chemokine receptors as human immunodeficiency virus type 1 coreceptors determined by individual amino acids in the envelope V3 loop. J. Virol. 71, 7136-7139 (1997).
18. Thali, M., et al. Characterization of conserved human immunodeficiency virus type 1 (HIV-1) gp120 neutralization epitopes exposed upon gp120-CD4 binding. J. Virol. 67, 3978-3988 (1993).
19. Sattentau, Q.J., Moore, J.P., Vignaux, F., Traincard, F. & Poignard, P. Conformational changes induced in the envelope glycoproteins of human and simian immunodeficiency virus by soluble receptor binding. J. Virol. 64, 7383-7393 (1993).
20. Wu, L., et al. CD4-induced interaction of primary HIV-1 gp120 glycoproteins with the chemokine receptor CCR-5. Nature 384, 179-183 (1996).
21. Moore, J.P., McKeating, J.A., Weiss, R.A. & Sattentau, Q.J. Dissociation of gp120 from HIV-1 virions induced by soluble CD4. Science 250, 1139-1142 (1990).
22. Bullough, P.A., Hughson, F.M., Skehel, J.J. & Wiley, D.C. Structure of influenza haemagglutinin at the pH of membrane fusion. Nature 371, 37-43 (1994) .
23. Fass, D., et al . Structure of a murine leukemia virus receptor-binding glycoprotein at 2.0 anstrom resolution. Science 277, 1662-1666 (1997) .
24. Wyatt, R., et al . The antigenic structure of the human immunodeficiency virus gpl20 envelope glycoprotein. Nature , submitted (1998) .
25. Kwong, P.D., et al . Quantitative probablility analysis and variational crystallization of gpl20, the exterior envelope glycoprotein of the human immunodeficiency virus type 1 (HIV-1) . J. Biol. Chem. , submitted (1998) .
26. Binley, J.M. , et al . Analysis of the interaction of antibodies with a conserved, enzymatically deglycosylated core of the HIV-1 gpl20 envelope glycoprotein. AIDS Res. Hum. Retroviruses 14, 191-198 (1997) .
27. Leesong, M., Henderson, B.S., Gillig, J.R., Schwab, J.M. & Smith, J.L. Structure of a dehydratase-isomerase from the bacterial pathway for biosynthesis of unsaturated fatty acids: two catalytic activities in one active site. Structure 4, 253-264 (1996).
28. Cedergren-Zeppezauer, E.S., Larsson, G., Nyman, P.O., Dauter, Z. & Wilson, K.S. Crystal structure of a dUTPase. Nature 355, 740-743 (1992).
29. Ryu, S.-E., Truneh, A., Sweet, R.W. & Hendrickson, W.A. Structures of an HIV and MHC binding fragment from human CD4 as refined in two crystal lattices. Structure 2, 59-74 (1994).
30. Rizzuto, C., et al. Identification of a conserved human immunodeficiency virus gp120 glycoprotein structure important for chemokine receptor binding. Science, submitted (1998).
31. Dragic, T., et al. Amino-terminal substitutions in the CCR5 coreceptor impair gp120 binding and human immunodeficiency type 1 entry. J. Virol. 72, 279-285 (1998).
32. Farzan, M., et al. A tyrosine-rich region in the N-terminus of CCR5 is important for human immunodeficiency virus type 1 entry and mediates an association between gp120 and CCR5. J. Virol. 72, 1160-1164 (1998).
33. Wyatt, R., et al. Analysis of the interaction of the human immunodeficiency virus type 1 gp120 envelope glycoprotein with the gp41 transmembrane glycoprotein. J. Virol. 71, 9722-9731 (1997).
34. Helseth, E., Olshevsky, U., Furman, C. & Sodroski, J. Human immunodeficiency virus type 1 gp120 envelope glycoprotein regions important for association with the gp41 transmembrane glycoprotein. J. Virol. 65, 2119-2123 (1991).
35. Wilson, I.A., Skehel, J.J. & Wiley, D.C. Structure of the haemagglutinin membrane glycoprotein of influenza virus at 3 Å resolution. Nature 289, 366-373 (1981).
36. Wyatt, R., et al. Involvement of the V1/V2 variable loop structure in the exposure of human immunodeficiency virus type 1 gp120 epitopes induced by receptor binding. J. Virol. 69, 5723-5733 (1995).
37. Rost, B., Sander, C. & Schneider, R. PHD -- an automated mail server for protein secondary structure prediction. Comput. Appl. Biosci. 10, 53-60 (1994).
38. Wyatt, R., et al. Functional and immunologic characterization of human immunodeficiency virus type 1 envelope glycoproteins containing deletions of the major variable regions. J. Virol. 67, 4557-4565 (1993).
39. Wu, H., et al. Kinetic and structural analysis of mutant CD4 receptors that are defective in HIV gp120 binding. Proc. Natl. Acad. Sci. USA 93, 15030-15035 (1996).
40. Chan, D.C., Fass, D., Berger, J.M. & Kim, P.S. Core structure of gp41 from the HIV envelope glycoprotein. Cell 89, 263-273 (1997).
41. Weissenhorn, W., Dessen, A., Harrison, S.C., Skehel, J.J. & Wiley, D.C. Atomic structure of the ectodomain from HIV-1 gp41. Nature 387, 426-430 (1997).
42. Dumonceaux, J., et al. Spontaneous mutations in the env gene of the human immunodeficiency virus type 1 NDK isolate are associated with a CD4-independent entry phenotype. J. Virol. 72, 519-519 (1998).
43. Moir, S., Perreault, J. & Poulin, L. Postbinding events mediated by human immunodeficiency virus type 1 are sensitive to modifications in the D4-transmembrane linked region of CD4. J. Virol. 70, 8019-8028 (1996).
44. Kohlstaedt, L.A., Wang, J., Friedman, J.M., Rice, P.A. & Steitz, T.A. Crystal structure at 3.5 Å resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science 256, 1783-1790 (1992).
45. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307-326 (1997).
46. Brunger, A.T. (Yale University, New Haven, 1993).
47. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47, 110-119 (1991).
48. Zhu, X., et al. Structural analysis of substrate binding by the molecular chaperone DnaK. Science 272, 1606-1614 (1996).
49. Carson, M. Ribbons 2.0. J. Appl. Crystallogr. 24, 958-961 (1991).
50. Nicholls, A., Sharp, K.A. & Honig, B. Protein folding and association: insight from the interfacial and thermodynamic properties of hydrocarbons. Proteins Struct. Funct. Genet. 11, 281-296 (1991).
Table 1. Structure solution

Data collection:
                            Native         K3IrCl6        K2OsCl6
Resolution limits (Å)       20-2.5         20-2.8         20-3.5
Total observations          113,966        76,739         25,821
Unique observations         37,724         28,599         11,982
Rsym (%)*†                  9.3 (24.7)     11.5 (20.2)    14.3 (18.2)
Data coverage               86.0 (62..     90.8 (82.9)    72.5 (62.5)

Molecular replacement:
                            Fab            CD4            Fab+CD4
Model                       1hil           3cd4/1cdh      1hil+1cdh
Scattering (%)‡             43             18             61
Rigid-body correlation#     0.249          0.079          0.325

Generation of experimental electron density:
Phasing procedure                               Correlation coefficient∥
Molecular replacement (MR)                      -0.02
Multiple isomorphous replacement (MIR)          0.34
Phase combination:
  MIR + MR                                      0.60
  + density modification                        0.66
  + density modification + subtraction          0.69
Density modelling (concatenation):
  MIR + MR                                      0.65
  + density modification                        0.68
  + density modification + subtraction          0.71
Combination map addition:                       0.73

Refinement statistics:
R-factors (10-2.5 Å):
Data cutoff (σ)             0
Rcrystal (Rfree) (%)        24.9 (32.8)    22.2 (30.7)    21.2 (29.2)
Data completeness (%)       85.8           77.3           66.4

Geometric parameters (rms):
Bond length (Å)             0.007
Bond angle (°)              1.59

B-factors:                  average        rms bond       rms angle
mainchain                   20.80          1.33           2.31
sidechain                   21.93          1.97           3.01
waters                      22.31

* Rsym = Σ|Iobs - Iavg| / ΣIavg
† Numbers in parentheses represent the statistics for the shell comprising the outer 10% (theoretical) of the data.
‡ The percentage of scattering of the initial search model.
# Correlation obtained upon rigid-body refinement of the model against 8-4 Å data.
∥ Correlation in the D1 region of CD4 between the experimental electron density and the calculated model density (from CD4 as positioned by molecular replacement) using 10-2.8 Å data. Correlations in this region (consisting of ~6000 Å3) were used to generate a quantitative measure of the overall quality of the ternary complex experimental electron density. For the purposes of these calculations, the model used for phase combination omitted D1. A correlation of 0.6 is roughly the level of an interpretable protein electron density map, while a well refined structure would give a correlation of about 0.9.
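The two statistics defined in the table footnotes can be illustrated with a minimal Python sketch. It is not the software used for the reported values. It assumes that the Rsym denominator sums the mean intensity once per observation (one common convention) and that the density correlation is a plain Pearson coefficient over matching grid points. The function names and the input layout are hypothetical.

```python
import math


def r_sym(observations):
    """Rsym = sum|Iobs - Iavg| / sum(Iavg) over symmetry-related measurements.

    `observations` maps a reflection label to the list of intensities
    measured for it (hypothetical input layout). The denominator counts
    Iavg once per observation, which is an assumed convention.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        i_avg = sum(intensities) / len(intensities)
        numerator += sum(abs(i - i_avg) for i in intensities)
        denominator += i_avg * len(intensities)
    return numerator / denominator


def density_correlation(rho_exp, rho_calc):
    """Pearson correlation between two density samples on the same grid."""
    if len(rho_exp) != len(rho_calc) or len(rho_exp) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    n = len(rho_exp)
    mean_e = sum(rho_exp) / n
    mean_c = sum(rho_calc) / n
    cov = sum((e - mean_e) * (c - mean_c) for e, c in zip(rho_exp, rho_calc))
    var_e = sum((e - mean_e) ** 2 for e in rho_exp)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    return cov / math.sqrt(var_e * var_c)
```

On this scale, a value of `density_correlation` near 0.6 would correspond to the level the footnote describes as roughly interpretable, and a value near 0.9 to a well refined structure.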
Fourth Series of Experiments
Human immunodeficiency virus type 1 (HIV-1) establishes persistent infections in humans, leading to the acquired immune deficiency syndrome (AIDS). The HIV-1 envelope glycoproteins, gp120 and gp41, are assembled into a trimeric complex that mediates virus entry into target cells (1). HIV-1 entry depends upon the sequential interaction of the gp120 exterior envelope glycoprotein with the receptors on the cell, CD4 and members of the chemokine receptor family (2-4). The gp120 glycoprotein, which can be shed from the envelope complex, elicits both virus-neutralizing and non-neutralizing antibodies during natural infection. Antibodies that lack neutralizing activity are often directed against the gp120 regions occluded on the assembled trimer and exposed only upon shedding (5,6). Neutralizing antibodies, by contrast, must access the functional envelope glycoprotein complex (7) and typically recognize conserved or variable epitopes near the receptor-binding regions (8-11). Here, we describe the spatial organization of conserved neutralization epitopes on gp120, utilizing epitope maps in conjunction with the X-ray crystal structure of a ternary complex that includes a gp120 core, CD4 and a neutralizing antibody (12). A large fraction of the predicted accessible surface of gp120 in the trimer is composed of variable, heavily glycosylated core and loop structures that surround the receptor-binding regions. Understanding the structural basis for the ability of HIV-1 to evade the humoral immune response should assist vaccine design.
In primary sequence, human and simian immunodeficiency virus gp120 glycoproteins consist of five variable regions (V1-V5) interposed among more conserved regions (13). Variable regions V1-V4 form exposed loops anchored at their bases by disulfide bonds (14). Neutralizing antibodies recognize both variable and conserved gp120 structures. The V2 and V3 loops contain epitopes for strain-restricted neutralizing antibodies (15-17). More broadly neutralizing antibodies recognize discontinuous, conserved epitopes in three regions of the gp120 glycoprotein (Table 1). In HIV-1 infected humans, the most abundant of these are directed against the CD4 binding site (CD4BS) and block gp120-CD4 interaction (8,9). Less common are antibodies against epitopes induced or exposed upon CD4 binding (CD4i) (18). Both CD4i and V3 antibodies disrupt the binding of gp120-CD4 complexes to chemokine receptors (10,11). A third gp120 neutralization epitope is defined by a unique monoclonal antibody, 2G12 (19), which does not efficiently block receptor binding (11).
In an accompanying article (12), we report the X-ray crystal structure of an HIV-1 gp120 core in a ternary complex with two-domain soluble CD4 and the Fab fragment of the CD4i antibody, 17b. The gp120 core lacks the V1/V2 and V3 variable loops, as well as N- and C-terminal sequences, which interact with the gp41 glycoprotein (6), and is enzymatically deglycosylated (12,21). Despite these modifications, the gp120 core binds CD4 and antibodies against CD4BS and CD4i epitopes (21,22) and thus retains structural integrity. The gp120 core is composed of an inner domain, an outer domain and a third element, the "bridging sheet" (12) (Figure 14a). All three structural elements contribute, either directly or indirectly, to CD4 and chemokine receptor binding (12). Here, the organization of the surface of gp120 is analyzed in light of the known antibody responses directed against this exposed viral glycoprotein.
Variability and glycosylation of the gp120 surface
Although generally well-conserved compared with the five variable regions, some variability in the surface of the gp120 core is evident when the sequences of all primate immunodeficiency viruses are analyzed. This variability is disproportionately associated with the surface of the outer domain proximal to the V4 and V5 regions and removed from the receptor-binding regions (Figure 14a, b, c). The LA, LC, and LE surface loops (12) contribute to the variability of this surface. The potential N-linked glycosylation sites present in the gp120 core are concentrated in this variable half of the protein (Figure 14, b and c). In fact, the only conserved residues apparent on this relatively variable surface are asparagine 356 and threonine/serine 358, which constitute a complex carbohydrate addition site within the LE loop (Figure 14, b and c). Since most carbohydrate moieties may appear as "self" to the immune system, the extensive glycosylation of the outer domain surface may render it less visible to immune surveillance. This helps to explain why antibodies directed against this gp120 surface have been identified so infrequently.
The receptor-binding regions retained in the gp120 core are well-conserved among primate immunodeficiency viruses (12). Also highly conserved is the surface of the inner domain spanned by the α1 helix and located opposite the variable surface described above (Figure 14d). This surface is likely to interact with gp41 and/or with N-terminal gp120 segments absent from the gp120 core. This inner domain surface and the receptor-binding regions are devoid of glycosylation.
Conserved gp120 neutralization epitopes
In conjunction with prior mutagenic and antibody competition analyses (5,6,18-21), the gp120 core structure reveals for the first time the spatial positioning of the conserved gp120 neutralization epitopes. Although the major variable loops are either absent (V1/V2 and V3) or poorly resolved (V4) in the gp120 core structure, their approximate positions can be deduced (Figure 15a). The conserved gp120 neutralization epitopes are discussed in relation to these variable loops and to the variable, glycosylated core surface.
a) CD4i epitopes. The gp120 epitope recognized by the CD4i antibody, 17b, can be directly visualized in the crystallized ternary complex (12) (Figure 15b, c). Strands from the gp120 fourth conserved (C4) region and the V1/V2 stem contribute to an antiparallel β-sheet (the "bridging sheet"; see Figure 14a) that contacts the antibody. The vast majority of gp120 residues previously implicated in formation of the CD4i epitopes (18) (Table 1) are located either within this β-sheet or in nearby structures. With the exception of Thr 202 and Met 434, the gp120 residues in contact with the 17b Fab are highly conserved among HIV-1 isolates (Figure 14c, 15a). The prominent ("male") CDR3 loop of the 17b heavy chain dominates the contacts with gp120, with additional contacts through the heavy chain CDR2 (12).
Unusually, there are minimal 17b light chain contacts, leaving a large gap between the gp120 core and most of the 17b light chain surface. In the complete gp120 glycoprotein, this gap is likely occupied by the V3 loop. This is consistent with the position and orientation of the V3 stem on the gp120 core structure (12), the effect of V3 deletions on the binding of CD4i antibodies in the absence of soluble CD4 (22), the competition of some V3-directed antibodies with CD4i antibodies (5), and the ability of both antibody groups to block chemokine receptor binding (10,11). The chemokine receptor-binding region of gp120 likely consists of elements near or within the "bridging sheet" and the V3 loop (Figure 14a), a model that is supported by recent mutagenic analysis (C. Rizzuto and J. Sodroski, submitted).
The V2 loop likely resides on the side of the 17b epitope opposite the V3 loop (Figure 15a). The V1/V2 loops, which vary from 57 to 86 residues in length (13), are dispensable for HIV-1 replication (22,27), but decrease the sensitivity of viruses to neutralization by antibodies against V3 and CD4i epitopes (27). The latter effect is mediated primarily by the V2 loop (22), suggesting that part of the V2 loop folds back along the V1/V2 stem to mask the "bridging sheet" and adjacent V3 loop. The proximity of the V2 and V3 loops is supported by the observation that, in monkeys infected with simian-human immunodeficiency viruses (SHIVs), neutralizing antibodies are raised against discontinuous epitopes with V2 and V3 components (B. Etemad-Moghadam and J. Sodroski, submitted). The CD4i epitopes are probably masked by the flanking V2 and V3 loops, requiring the evolution of antibodies with protruding ("male") CDRs to access these conserved epitopes. CD4 binding has been suggested to reposition the V1/V2 loops, thus exposing the CD4i epitopes (22). The presence of contacts between the V1/V2 stem and CD4 in the crystal structure (12) is consistent with this model.
b) CD4BS epitopes. CD4 makes a number of contacts within a recessed pocket on the gp120 surface. The gp120-CD4 interface includes two cavities, one water-filled and bounded equally by both proteins, the other extending into the gp120 interior and contacting CD4 only at phenylalanine 43 (Figure 14a) (12). Table 1 and Figure 15b, c show the gp120 residues implicated in the formation of CD4BS epitopes recognized by eight representative antibodies. CD4BS epitopes are uniformly disrupted by changes in Asp 368 and Glu 370 (20), which surround the opening of the "Phe 43 cavity." These residues are located on a ridge at the intersection of the two receptor-binding gp120 surfaces, consistent with competition studies suggesting that CD4BS epitopes overlap both the CD4i epitopes and the binding site for CD4 (5,18). The location of the gp120 residues implicated in the formation of the CD4BS epitopes suggests that important elements of the CD4-binding surface of gp120 are accessible to antibodies.
Some CD4BS antibodies, like IgG1b12, are particularly potent at neutralizing HIV-1 (23). IgG1b12 binding is disrupted by gp120 changes that affect the binding of other CD4BS antibodies but, atypically, is sensitive to changes in the V1/V2 stem-loop structure (24). The observation that some well-conserved residues in the gp120 V1/V2 stem contact CD4 (12) raises the possibility that this protruding structure also contributes to the IgG1b12 epitope. This might increase the ability of the antibody to access the assembled envelope glycoprotein trimer, thus increasing neutralizing capability.
While the CD4BS epitopes and the CD4-binding site overlap, several observations demonstrate that the binding of CD4BS antibodies differs from that of CD4. Changes in Trp 427, a gp120 residue that contacts both the "Phe 43 cavity" and CD4, uniformly disrupt CD4 binding but affect the binding of only some CD4BS antibodies (Table 1). Conversely, some changes in other cavity-lining gp120 residues, Ser 256 and Thr 257, affect the binding of CD4BS antibodies more than the binding of CD4 (20). Since the recessed position of Ser 256 and Thr 257 in the current crystal structure (Figure 15b, c) makes direct contacts with antibody unlikely, either the effects of changes in these residues are indirect or the CD4BS antibodies recognize a gp120 conformation that differs from the CD4-bound state. With respect to the latter possibility, it is interesting that several of the residues implicated in the integrity of the CD4BS epitopes are located in the interface between the inner and outer gp120 domains. CD4BS antibodies might recognize a gp120 conformation in which the spatial relationship between the domains is altered compared with the CD4-bound state, thus allowing better surface exposure of these residues. Differences between the CD4BS epitopes and the CD4-binding site create opportunities for neutralization escape (20). The gp120 residues surrounding the "Phe 43" cavity are highly conserved among primate immunodeficiency viruses (Figure 15a), but the observed modest variation in adjacent surface-accessible residues (e.g., Pro 369, Thr 373 and Lys 432) could account for decreased recognition of the gp120 glycoprotein from some geographic clades of HIV-1 by CD4BS antibodies (24). Additional potential for variation near or within the CD4BS epitopes is created by the unusual water-filled cavity in the gp120-CD4 binding interface, since CD4 binding can apparently tolerate change in the gp120 residues contacting this cavity (12).
The recessed nature of the CD4 binding pocket on gp120 (Figure 14c) may delay the generation of high-affinity antibodies against the CD4BS epitopes and may afford opportunities to minimize the antiviral efficacy of such antibodies once they are elicited. The degree of recession is probably much greater on the full-length, glycosylated gp120 than is evident on the crystallized gp120 core. The recessed pocket is flanked on one side by the V1/V2 stem-loop structure. The characterization of HIV-1 escape mutants from the IgG1b12 CD4BS antibody and the mapping of several V2 conformational epitopes support a model in which the V2 loop folds back along the V1/V2 stem, with V2 residues 183-188 proximal to Asp 368 and Glu 370. This model is consistent with observations that V1/V2 changes, in combination with V3 changes, can alter the exposure of the adjacent CD4BS epitopes, particularly on the assembled trimer (28). The high temperature factors associated with the V1/V2 stem (12) imply flexibility in this protruding element (Figure 14c, d), expanding the potential range of space occupied by the V1/V2 stem-loop structure. This could enhance masking of the adjacent CD4BS and CD4i gp120 epitopes and divert antibody responses towards the variable loops.
Glycosylation may modify the interaction of antibodies with CD4BS epitopes. The LD loop, on the rim of the CD4-binding pocket opposite the V1/V2 stem, contains a well-conserved glycosylation site, asparagine 276 (Figure 14c). Changes in this site and at the adjacent alanine 281 have been associated with escape from the neutralizing activity of patient sera (25) and have been seen in SHIVs extensively passaged in monkeys (26). Another conserved glycosylation site at asparagine 386 lies adjacent to both CD4BS and CD4i epitopes (Figure 14c) and could diminish antibody responses against those sites. Additionally, in various HIV-1 strains, carbohydrates are added to the V2 loop segment (residues 186-188) thought to be proximal to the CD4BS epitopes.
c) The 2G12 epitope. The integrity of the 2G12 epitope is disrupted by changes in gp120 glycosylation, either by glycosidase treatment or mutagenic alteration of specific N-linked carbohydrate addition sites (19). These sites are located on the relatively variable surface of the gp120 outer domain, opposite to and approximately 25 Å away from the CD4 binding site (Figure 15b, c). The gp120 glycoprotein synthesized in mammalian cells exhibits a dense concentration of high-mannose sugars in this region (Figure 15a). Even in the enzymatically deglycosylated gp120 core, carbohydrate residues constitute much of this surface. 2G12 likely binds at least in part to these carbohydrates, explaining the surprising conservation of the 2G12 epitope despite the variability of the underlying protein surface, which includes the stem of the V3 loop and the V4 variable region. The inclusion of carbohydrate in the epitope might also explain the apparent rarity with which these antibodies are generated. The localization of the 2G12 epitope is consistent with previous studies indicating that 2G12 forms a unique competition group (5,19) and does not interfere with the binding of monomeric gp120 to either CD4 or chemokine receptors (11). Since the 2G12 epitope is predicted to be oriented towards the target cell upon CD4 binding (see below), the antibody may sterically impair interactions of the oligomeric envelope glycoprotein complex with host cell moieties.
Orientation of gp120 in the trimer
Possible orientations of the exterior glycoproteins in the trimer are significantly constrained by the requirement that observed and deduced binding sites for receptors and neutralizing antibodies, sites of N-linked glycosylation, and variable structures be exposed on the surface of the assembled complex. The two-domain CD4 in the ternary complex structure was aligned to the structure of four-domain CD4 (29) to orient the trimer model with respect to the target cell membrane. The consequences of such a model (Figure 16) are: a) the chemokine receptor-binding sites are clustered at the vertex of the trimer predicted to be closest to the target cell; b) both variable and conserved neutralization epitopes are concentrated on the half of gp120 facing the target cell; c) possibilities for intersubunit interactions among the variable structures that could help mask conserved neutralization epitopes are created; d) the subset of gp120 glycosylation sites to which complex carbohydrates are added in mammalian cells (14) is well-exposed on the outer periphery of the trimer; e) the highly conserved surface near the α1 helix is available for gp41 and/or gp120 protein interactions within the trimers; and f) the surface of the assembled envelope glycoprotein complex is roughly hemispherical, thus minimizing the surface area of the viral spike that is potentially exposed to antibodies.
In summary, the X-ray crystal structure of the gp120 core/two-domain CD4/17b Fab complex provides a framework for visualizing key interactions between HIV-1 and the humoral immune system. Previous antibody competition analyses suggested that the gp120 surface buried in the assembled trimer elicits non-neutralizing antibodies (5,6). By contrast, the binding sites for neutralizing antibodies cluster on a different gp120 surface (5). Our structural studies support the existence of non-neutralizing and neutralizing faces of gp120, and reveal another, immunologically "silent" face of the glycoprotein (Figure 15d). This outer domain surface, along with the major variable loops, contributes to the large fraction of the gp120 surface that is protected against antibody responses by a dense array of carbohydrates and by the capacity for variation. The conserved receptor-binding regions of gp120 represent attractive targets for immune intervention. However, the elicitation of antibodies against these conformation-dependent structures is inefficient. Since the gp120 epitopes near the receptor-binding regions span the inner and outer domains, interdomain conformational shifts may decrease their representation in the immunogen pool. The recessed nature of the CD4-binding site likely contributes to its poor immunogenicity. The sequential recognition of two receptors by primate immunodeficiency viruses allows the conserved elements of the chemokine receptor-binding site to be created or exposed only after CD4 binding has occurred. At that point, it is likely that the proximity of the chemokine receptor-binding site to the cell membrane sterically limits antibody binding. The evolution of primate immunodeficiency viruses that successfully persist despite the host immune response presents challenges to vaccine development. An understanding of the structures of the relevant gp120 epitopes should assist efforts to overcome these hurdles.
Materials and Methods
Graphics. Molecular graphics were produced using MidasPlus (University of California, San Francisco) and GRASP (30).
Assignment of variability. Variability in gp120 residues was assessed using an alignment of sequences derived from approximately 400 HIV-1, HIV-2 and simian immunodeficiency viruses (13). Residues were assigned variability indices and color coded as follows:
Red: conserved in all primate immunodeficiency viruses;
Orange: conserved in all HIV-1, including groups M and O and chimpanzee isolates;
Yellow: some variation among HIV-1 isolates (divergence from the consensus sequence in 1-8 of the 12 HIV-1 groups examined);
Green: variable among HIV-1 isolates (divergence from the consensus sequence in > 9 of the 12 HIV-1 groups examined).
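The color-coding rule can be expressed as a small decision function. The following Python sketch is illustrative only. The function name and arguments are hypothetical, and the text as written leaves a count of exactly 9 divergent groups unassigned, so the sketch assumes that 9 or more groups is treated as "green".

```python
def variability_color(conserved_all_primate, conserved_all_hiv1,
                      divergent_groups=0, total_groups=12):
    """Return the color class for one gp120 residue.

    conserved_all_primate: residue conserved in all primate immunodeficiency viruses
    conserved_all_hiv1:    residue conserved in all HIV-1 (groups M and O, chimpanzee isolates)
    divergent_groups:      number of HIV-1 groups diverging from the consensus
    """
    if conserved_all_primate:
        return "red"
    if conserved_all_hiv1:
        return "orange"
    if not 1 <= divergent_groups <= total_groups:
        raise ValueError("a non-conserved residue must diverge in 1..total_groups groups")
    if divergent_groups <= 8:
        return "yellow"
    # Assumption: a count of exactly 9 is grouped with the variable class.
    return "green"
```

For example, a residue that diverges from the consensus in 3 of the 12 groups would be classed as "yellow" under these assumptions.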
Molecular modeling. Residues 88, 89, and 397-409, which are disordered in the ternary complex crystals (12), were built manually using the program TOM. For the V4 loop (residues 397-409), a dominant constraint was the distance between the ordered residues 396 and 410 (Cα-Cα distance of 26.88 Å). For the carbohydrate, examination of the N-linked carbohydrate in several crystal structures (e.g. 1fc2, 1gly, 1lte) showed that the core common to both high-mannose and complex N-linked sugars, (NAG)2(MAN)3, did not differ greatly in conformation after alignment of the first NAG. This core, which represents roughly half the total glycosylation for a typical N-linked site, was built onto each of the 18 consensus N-linked glycosylation sites found on the HXBc2 gp120 core. The stereochemistry of this initial model was refined using simulated annealing in XPLOR. Briefly, the model was heated to between 2,500 K and 3,500 K, and "slow cooled" in steps of 25 K to 300 K. At each step, molecular dynamics were performed with the core gp120 fixed, allowing only the modeled residues and carbohydrate (including any attached Asn) to move. In three separate runs, performing molecular dynamics for 5 fs/step, all steric clashes could be removed and the geometry idealized, with an average root mean square (RMS) carbohydrate movement of only ~3.5 Å. Four subsequent runs were made using dynamic times of between 50 and 75 fs/step. The carbohydrate positions obtained from these runs differed more substantially from those in the starting model (average carbohydrate RMS difference of roughly 8 Å). Two of the models from these longer annealings were much more similar to each other than to the rest (RMS differences in carbohydrate of ~4 Å versus ~8 Å for all other models). One had been heated to 3,500 K with dynamics of 75 fs/step. The other (shown in the figures displayed here) was heated to only 2,500 K with dynamics of 50 fs/step. In general, the RMS movement of the NAG sugars was roughly half the RMS movement of the MAN sugars, reflecting greater conformational flexibility further from the protein surface.
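The slow-cooling schedule and the RMS comparison of carbohydrate positions described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not the XPLOR protocol itself. The function names are hypothetical, and the RMS difference is computed without superposition, on the assumption that the fixed gp120 core keeps all models in a common frame.

```python
import math


def slow_cool_schedule(start_k, end_k=300, step_k=25):
    """List the temperatures visited when cooling from start_k to end_k in step_k decrements."""
    if step_k <= 0 or start_k < end_k:
        raise ValueError("need start_k >= end_k and a positive step")
    temps = list(range(start_k, end_k - 1, -step_k))
    if temps[-1] != end_k:
        temps.append(end_k)  # finish exactly at the target temperature
    return temps


def rms_difference(coords_a, coords_b):
    """RMS distance between matching atoms of two models, given as (x, y, z) tuples in Å."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Under these assumptions, a run started at 2,500 K visits 89 temperatures and a run started at 3,500 K visits 129. Applying `rms_difference` to the carbohydrate atoms of two annealed models gives the kind of pairwise figure quoted in the text (~4 Å versus ~8 Å).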
References for the Fourth Series of Experiments
1. Allen, J. et al. Identification of the major envelope glycoprotein product of HTLV-II. Science 22^, 1091-1094 (1983).
2. Dalgleish, A.G. et al. The CD4 (T4) antigen is an essential component of the receptor for the AIDS retrovirus. Nature 312, 763-767 (1984).
3. Klatzmann, D. et al. T-lymphocyte T4 molecule behaves as the receptor for human retrovirus LAV. Nature 312, 767-768 (1984).
4. Feng, Y., Broder, C.C., Kennedy, P.E. and Berger, E. HIV-1 entry cofactor: Functional cDNA cloning of a seven-transmembrane, G protein-coupled receptor. Science 272, 872-877 (1996).
5. Moore, J.P. and Sodroski, J. Antibody cross-competition analysis of the human immunodeficiency virus type 1 gp120 exterior envelope glycoprotein. J. Virol. 70, 1863-1872 (1996).
6. Wyatt, R. et al. Analysis of the interaction of the human immunodeficiency virus type 1 (HIV-1) gp120 envelope glycoprotein with the gp41 transmembrane glycoprotein. J. Virol. 71, 9722-9731 (1997).
7. Sattentau, Q.J. and Moore, J.P. Human immunodeficiency virus type 1 neutralization is determined by epitope exposure on the gp120 oligomer. J. Exp. Med. 182, 185-196 (1995).
8. Posner, M. et al. An IgG human monoclonal antibody which reacts with HIV-1 gp120, inhibits virus binding to cells, and neutralizes infection. J. Immunol. 146, 4325-4332 (1991).
9. Ho, D. et al. Conformational epitope on gp120 important in CD4 binding and human immunodeficiency virus type 1 neutralization identified by a human monoclonal antibody. J. Virol. 65, 489-493 (1991).
10. Wu, L. et al. CD4-induced interaction of primary HIV-1 gp120 glycoproteins with the chemokine receptor CCR5. Nature 384, 179-183 (1996).
11. Trkola, A. et al. CD4-dependent, antibody-sensitive interactions between HIV-1 and its co-receptor CCR-5. Nature 384, 184-187 (1996).
12. Kwong, P. et al. Nature, submitted.
13. Myers, G. et al. Human retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences. Los Alamos National Laboratory. Los Alamos, N.M. 1996.
14. Leonard, C. et al . Assignment of interchain disulfide bonds and characterization of potential glycosylation sites of the type 1 human immunodeficiency virus envelope glycoprotein (gpl20) expressed in Chinese hamster ovary cells. J. Biol. Chem. 265, 10373-10382 (1990) .
15. Fung, M.S.C et al . Identification and characterization of a neutralization site within the second variable region of human immunodeficiency virus type 1 gpl20. J. Virol. 66, 848-856 (1992).
16. Putney, S. et al . HTLV-H/LAV-neutralizing antibodies to an E. coli-produced fragment of the virus envelope. Science 234 , 1392-1395 (1986) . 17. Rusche, J.R. et al . Antibodies that inhibit fusion of human immunodeficiency virus -infected cells bind a 24-amino-acid sequence of the viral envelope gpl20. Proc. Natl. Acad. Sci. USA 85., 3198-3202 (1988) .
18. Thali, M. et al. Characterization of conserved human immunodeficiency virus type 1 gp120 neutralization epitopes exposed upon gp120-CD4 binding. J. Virol. 67, 3978-3988 (1993).
19. Trkola, A. et al. Human monoclonal antibody 2G12 defines a distinctive neutralization epitope on the gp120 glycoprotein of human immunodeficiency virus type 1. J. Virol. 70, 1100-1108 (1996).
20. Thali, M. et al. Discontinuous, conserved neutralization epitopes overlapping the CD4 binding region of the HIV-1 gp120 envelope glycoprotein. J. Virol. 66, 5635-5641 (1992).
21. Binley, J. et al. Analysis of the interaction of antibodies with a conserved, enzymatically deglycosylated core of the HIV-1 gp120 envelope glycoprotein. AIDS Res. Hum. Retroviruses 14, 191-198 (1997).
22. Wyatt, R. et al. Involvement of the V1/V2 variable loop structure in the exposure of human immunodeficiency virus type 1 gp120 epitopes induced by receptor binding. J. Virol. 69, 5723-5733 (1995).
23. Roben, P. et al. Recognition properties of a panel of human recombinant Fab fragments to the CD4 binding site of gp120 that show differing abilities to neutralize human immunodeficiency virus type 1. J. Virol. 68, 4821-4828 (1994).
24. Moore, J.P. et al. Exploration of antigenic variation in gp120 from clades A through F of human immunodeficiency virus type 1 by using monoclonal antibodies. J. Virol. 68, 8350-8364 (1994).
25. Watkins, B.A. et al. Immune escape by human immunodeficiency virus type 1 from neutralizing antibodies: evidence for multiple pathways. J. Virol. 67, 7493 (1993).
26. Karlsson, G. et al. Characterization of molecularly cloned simian-human immunodeficiency viruses causing rapid CD4+ lymphocyte depletion in rhesus monkeys. J. Virol. 71, 4218 (1997).
27. Cao, J. et al. Replication and neutralization of human immunodeficiency virus type 1 lacking the V1/V2 variable loops of the gp120 envelope glycoprotein. J. Virol. 71, 9808-9812 (1997).
28. Wyatt, R. et al. Functional and immunologic characterization of human immunodeficiency virus type 1 envelope glycoproteins containing deletions of the major variable regions. J. Virol. 67, 4557-4565 (1993).
29. Wu, H., Kwong, P.D. and Hendrickson, W.A. Dimeric association and segmental variability in the structure of human CD4. Nature 387, 527-530 (1997).
30. Nicholls, A., Sharp, K.A. and Honig, B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281-296 (1991).
Table X. Conserved Epitopes for Neutralizing Antibodies Identified on the gp120 Core
Figure imgf000197_0002
Figure imgf000197_0001
Figure imgf000198_0002
a The gp120 competition groups are defined as in Reference 5.
b The gp120 amino acids are numbered according to the sequence of the HXBc2 (IIIB) gp120 glycoprotein, where residue 1 is the methionine at the amino-terminus of the signal peptide. Changes in the amino acids listed resulted in significant reduction in antibody binding to the gp120 glycoprotein (Ref. 18-20). The numbers in parentheses indicate the percentage of the CD4BS antibodies examined whose binding is decreased by changes in the indicated residue.
Figure imgf000198_0001
Fifth Series of Experiments
The entry of primate immunodeficiency viruses into target cells depends upon a sequential interaction of the gp120 envelope glycoprotein with the cellular receptors, CD4 and members of the chemokine receptor family. The gp120 third variable (V3) loop has been implicated in chemokine receptor binding, but the use of the CCR5 chemokine receptor by diverse primate immunodeficiency viruses suggests the involvement of an additional, conserved gp120 element. Here we identify a highly conserved gp120 structure that is critical for CCR5 binding, is located adjacent to the V3 loop, and contains neutralization epitopes induced by CD4 binding. This conserved element may be a useful target for pharmacologic or prophylactic intervention in immunodeficiency virus infections.
The clinically abundant primate immunodeficiency viruses bind the β-chemokine receptor CCR5 as an obligate step in virus entry into target cells (1,2). The gp120 glycoproteins of primary, macrophage-tropic HIV-1 strains have been shown to bind specifically to cells expressing CCR5 (3,4). The affinity of gp120 binding was increased 2-3 logs by the presence of soluble CD4 (sCD4) (3). Efficient CCR5 binding was dependent upon the presence of the V3 variable loop of gp120, but the gp120 V1/V2 variable loops and N- and C-termini were dispensable for high-affinity binding to CCR5 (3). No significant CCR5 binding was observed for gp120 glycoproteins derived from laboratory-adapted HIV-1 isolates, which do not use CCR5 as a coreceptor (3,4).
Specific groups of HIV-1 neutralizing antibodies directed against the gp120 V3 loop or CD4-induced (CD4i) epitopes were able to block the binding of gp120-sCD4 complexes to CCR5-expressing cells (3,4). The CD4i epitopes are conserved, discontinuous gp120 structures that are exposed better after CD4 binding (5). Mutagenic analysis suggested that elements of the conserved stem of the V1/V2 stem-loop and of the fourth conserved region of gp120 comprise the CD4i epitopes (5). Here we test the hypothesis that conserved gp120 residues near or within the CD4i epitopes are critical for CCR5 binding.
An assay was established that could assess the CCR5-binding ability of a panel of HIV-1 gp120 glycoprotein mutants. The mutants were created by the introduction of single amino acid changes in gp120 residues near or within regions previously shown to be important for the integrity of the CD4i epitopes (5). During the course of this work, structural information on the gp120 epitope recognized by a CD4i-directed antibody, 17b, became available (6) (see below) and was used to guide the mutagenesis. The wtΔ glycoprotein, which lacks the V1/V2 variable loops and the N-terminus and is derived from the YU2 primary macrophage-tropic HIV-1 isolate (7), was the starting point for the studies (Fig. 17). This protein was chosen because it had been shown to bind CD4 and CCR5 with high affinity (3,8,9). Furthermore, the use of this protein minimized the opportunities for indirect effects of gp120 amino acid changes on CCR5 binding (e.g., by repositioning the V1/V2 loops, which can mask CD4i epitopes (9)). Metabolically labeled wtΔ and mutant derivatives were produced in 293T cells and incubated with mouse L1.2 cells stably expressing human CCR5 (3), in either the absence or presence of sCD4. The cells were washed and lysed, and bound gp120 protein was detected by precipitation with a mixture of sera from HIV-1-infected individuals (10).
The wtΔ protein efficiently bound to the L1.2-CCR5 cells in the presence of sCD4 (Fig. 18, A and B). Binding was dramatically reduced when sCD4 was not present in the assay. The wtΔ protein binding to the L1.2-CCR5 cells was inhibited by preincubation of the wtΔ protein with the 17b antibody. Binding was also inhibited by incubation of the L1.2-CCR5 cells with the 2D7 antibody against CCR5 (11) or with the CCR5 ligand, MIP-1β (12). The C11 antibody, which is directed against a gp120 region dispensable for CCR5 binding (3), did not block the binding of the wtΔ protein to the L1.2-CCR5 cells (data not shown). The wtΔ protein did not bind appreciably to the parental L1.2 cells not expressing CCR5, even in the presence of sCD4. These results suggest that the wtΔ protein binds CCR5 in a specific, CD4-dependent manner.
The binding of the panel of gp120 mutants to the L1.2-CCR5 cells in the absence and presence of sCD4 was measured. The recognition of the mutant proteins by sCD4 and by monoclonal antibodies that recognize discontinuous gp120 epitopes (5,13) was assessed in parallel (10). Changes in several gp120 amino acids resulted in dramatic reductions in the ability of the protein to bind to L1.2-CCR5 cells in the presence of sCD4 (Table 1 and Fig. 18C). In some cases (257 T/D, 370 E/Q and 383 F/S), the attenuated CD4-binding ability of the mutant proteins could account for the observed reduction in binding to the L1.2-CCR5 cells. In most cases, however, the mutant proteins that were deficient in CCR5 binding still bound sCD4 and at least one of the monoclonal antibodies recognizing discontinuous gp120 epitopes. As expected, some of the introduced amino acid changes decreased recognition by the 17b antibody. Interestingly, two of the gp120 amino acid changes (437 P/A, 442 Q/L) resulted in an increase in CCR5 binding compared with the wtΔ protein, even though CD4 binding was not significantly increased. In the absence of sCD4, the 437 P/A and 442 Q/L envelope glycoprotein mutants bound to the L1.2-CCR5 cells slightly better than the other mutants and the wtΔ protein, which exhibited very low levels of binding (Fig. 18A and data not shown).
Recently, the structure of an HIV-1 gp120 core crystallized in a ternary complex with two-domain CD4 and the 17b Fab has been solved (6). The gp120 core is composed of an inner domain, an outer domain, and a "bridging sheet" (Fig. 19A). The "bridging sheet" is a four-stranded, antiparallel β-sheet that includes the V1/V2 stem and strands (β20 and β21) derived from the fourth conserved gp120 region. CD4 contacts gp120 residues in the outer domain and the "bridging sheet" (6). The gp120 residues implicated by our study in CCR5 binding are located near or within the "bridging sheet" (Fig. 19, A and B). The "bridging sheet" is predicted to face the target cell after the envelope glycoproteins bind CD4 (6). Even more than the CD4-binding site, the gp120 region implicated in CCR5 binding is highly conserved among primate immunodeficiency viruses; this is particularly apparent in comparison to the remainder of the gp120 surface thought to be exposed on the assembled envelope glycoprotein complex (Fig. 19C) (6). The CD4i epitope for the 17b antibody is located near or within the "bridging sheet" (6), consistent with the ability of the antibody to block CCR5 binding (3,4). All of the individual gp120 residues in which changes disrupted recognition by the 17b antibody (Fig. 19D) are located close to the gp120-17b interface in the crystallized complex (Table 1). The binding of another antibody, CG10, which disrupts gp120-CCR5 interaction (3) and competes with the 17b antibody for gp120 binding (14), is also affected by changes in amino acid residues within or near the "bridging sheet" (Fig. 19E). The position and orientation of the V3 base in the structure (6), in conjunction with a number of mutagenic and antibody competition studies (15), suggest that the gp120 V3 loop resides proximal to the region implicated in CCR5 binding (Fig. 19A). For example, the binding of both CG10 and CD4i antibodies to gp120 can be disrupted by some V3 changes (5,14). Furthermore, several V3-directed antibodies compete with CD4i antibodies for gp120 binding (15).
Our observations suggest that the CCR5-binding site is likely composed of conserved gp120 elements near or within the "bridging sheet" and V3 loop residues. The latter might include more conserved structures (e.g., the aromatic or hydrophobic residue at position 317, altered in this study) as well as more variable structures (16) that determine the specific chemokine receptor used. Some of the gp120 residues identified in this and previous studies (16) as determinants of chemokine receptor utilization could modulate the interaction of the V3 loop and elements near the "bridging sheet". For example, studies of HIV-1 revertants (15) suggested a functional interaction of gp120 residue 440, shown here to influence CCR5 binding, with the V3 loop.
A subset of the gp120 residues in or near the "bridging sheet" likely contacts CCR5 directly. Most of the gp120 residues implicated in CCR5 binding exhibit reasonable solvent accessibility in the free gp120 core (Table 1), consistent with this possibility. The gp120 surface implicated in CCR5 binding is highly basic (6), potentially favoring interactions with the acidic CCR5 amino terminus, which has been shown to be important for gp120 binding (17,18). Additionally, hydrophobic interactions, similar to those seen for gp120-17b binding (6), may also contribute to the gp120-CCR5 interaction.
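The fractional solvent accessibility referred to here is defined in the footnote to Table 1 as the solvent-accessible surface area of a residue in the gp120 core divided by the area of the same residue in a Gly-X-Gly tripeptide. The following Python sketch illustrates only that ratio; the function name and the numeric areas are hypothetical and are not taken from Table 1 or from the program actually used.

```python
def fractional_solvent_accessibility(area_in_core: float,
                                     area_in_tripeptide: float) -> float:
    """Return the accessible area of a residue in the protein core divided by
    its accessible area in an isolated Gly-X-Gly reference tripeptide."""
    if area_in_tripeptide <= 0:
        raise ValueError("reference tripeptide area must be positive")
    if area_in_core < 0:
        raise ValueError("accessible area cannot be negative")
    return area_in_core / area_in_tripeptide


# Hypothetical side-chain areas in square angstroms, for illustration only.
ratio = fractional_solvent_accessibility(45.0, 150.0)
print(f"fractional accessibility = {ratio:.2f}")
```

A value near 0 would correspond to a buried side chain and a value near 1 to a side chain about as exposed as in the free tripeptide.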
The exposure and/or formation of the CCR5-binding site of HIV-1 gp120 glycoproteins is dependent upon interaction with CD4 (3,4). CD4 binding has been shown to reposition the V1/V2 variable loops and thus expose the CD4i epitopes (9), which overlap the CCR5-binding region (3,4). However, since a gp120 glycoprotein lacking the V1 and V2 variable loops also exhibits CD4-dependent CCR5 binding (3), the interaction with CD4 must cause other conformational changes in gp120 related to the CCR5-binding site. Our results, which highlight the proximity of the two receptor-binding sites on gp120, provide likely explanations for the induction of such conformational changes. First, one of the components of the "bridging sheet", the V1/V2 stem, also contacts CD4 (6). Thus, CD4 binding, which appears to distort the V1/V2 stem, may reposition this structure and allow the formation of the β-sheet important for CCR5 binding. In this respect, we note that a substitution of aspartic acid for threonine 123, which is located in the V1/V2 stem and contacts CD4, significantly decreases CCR5 binding. This substitution may disrupt CD4-induced conformational changes in the V1/V2 stem required for CCR5 binding. Second, the CD4-bound conformation of gp120 exhibits a cavity (the "Phe 43" cavity) within the gp120 interior (6). This cavity contacts the gp120 inner and outer domains as well as the "bridging sheet" and likely forms as a result of interdomain conformational changes in gp120 induced by CD4 binding (6). Since the "bridging sheet" lacks its own hydrophobic core and is thus dependent upon residues contributed by both inner and outer domains (6), any shift in orientation between these domains would alter the conformation of the "bridging sheet". Furthermore, CD4 binding could also alter the precise orientation of the "bridging sheet" with respect to the inner and outer domains, thus aligning the V3 loop and conserved gp120 elements important for CCR5 binding.
To summarize, CD4 binding likely induces conformational changes within the "bridging sheet" as well as between this sheet and the inner and outer domains to form the high-affinity CCR5-binding site. For some primate immunodeficiency viruses, the CD4-bound conformation of gp120 must be energetically accessible in the absence of CD4 to explain the documented examples of CD4-independent chemokine receptor binding and entry (18,19).
It is likely that the CCR5-binding region defined in this study is also important for the binding of simian and human immunodeficiency viruses to other chemokine receptors. The identified region exhibits one of the most highly conserved surfaces on the HIV-1 gp120 glycoprotein (6), supporting its functional importance for all primate immunodeficiency viruses. The laboratory-adapted HXBc2 envelope glycoprotein, which uses CXCR4 and not CCR5 as a coreceptor (1,2,20), can be converted to an efficient CCR5-using protein simply by substituting the V3 loop of the YU2 virus (2). Thus, all of the CCR5-binding region outside of the V3 loop must be conserved, at least between the HXBc2 and YU2 viruses. Indeed, we have shown that alteration of lysine 117, lysine 207 and glycine 441 in the HXBc2-YU2V3 chimeric protein also disrupts CCR5 binding (21). Consistent with the use of this region for the binding of other chemokine receptors is the observation (19) that the gp120 changes associated with the conversion of HIV-2 to a CD4-independent, CXCR4-using virus affect the "bridging sheet" and the V3 loop. Alterations in "bridging sheet" residues have also been implicated in changes in the tropism of HIV-1 for immortalized cell lines that do not express CCR5 (22). Finally, the 17b antibody neutralizes HIV-1 strains that use different chemokine receptors (5,14), supporting the involvement of a common gp120 region in chemokine receptor interaction.
Chemokine receptor binding may trigger additional conformational changes in the envelope glycoprotein complex that ultimately lead to the fusion of the viral and target cell membranes. It is believed that some of these changes include exposure of the ectodomain of the gp41 transmembrane envelope glycoprotein (23). It is interesting that the CCR5-binding region defined herein likely resides close to the trimer axis of the assembled envelope glycoprotein complex (6). Indeed, some of the gp120 residue changes that affect CCR5 binding also affect the non-covalent association of gp120 and gp41 subunits in the trimeric complex (21). These observations raise the possibility that chemokine receptor binding alters the relationship between gp120 and gp41, leading to the exposure of the gp41 ectodomain and interaction with the target cell membrane.
The definition of a highly conserved gp120 structure that is important for binding to CCR5, the major coreceptor used by clinically abundant primate [...] immunologic inhibitors of virus-receptor interactions. An understanding of the CD4-induced conformational changes in this structure may allow the targeting of such inhibitors to native or CD4-bound states of gp120.
References for the Fifth Series of Experiments
1. G. Alkhatib et al., Science 272, 1955 (1996); H.K. Deng et al., Nature 381, 661 (1996); B.J. Doranz et al., Cell 85, 1149 (1996); T. Dragic et al., Nature 381, 667 (1996).
2. H. Choe et al., Cell 85, 1135 (1996).
3. L. Wu et al., Nature 384, 179 (1996).
4. A. Trkola et al., Nature 384, 184 (1996).
5. M. Thali et al., J. Virol. 67, 3978 (1993).
6. P. Kwong et al., submitted; R. Wyatt et al., submitted; P. Kwong et al., submitted.
7. Y. Li et al., J. Virol. 65, 3973 (1991).
8. R. Wyatt et al., J. Virol. 67, 4557 (1993); R. Wyatt et al., J. Virol. 71, 9722 (1997).
9. R. Wyatt et al., J. Virol. 69, 5723 (1995).
10. 293T cells were cotransfected with 20 μg of a plasmid expressing the wtΔ or mutant envelope glycoproteins and 2 μg of a plasmid expressing the HIV-1 Tat protein, using the calcium phosphate technique. Transfected cells were washed and metabolically labeled for 16 hours with 50 μCi/ml 35S-cysteine and 50 μCi/ml 35S-methionine. Labeled cell supernatants were harvested, cleared by low-speed centrifugation (200 × g for 10 minutes at 4°C) and stored at 4°C until used in the binding assays. For measurement of the binding of sCD4 and antibodies to the wtΔ and mutant envelope glycoproteins, different dilutions of the envelope glycoprotein-containing supernatants were precipitated to ensure that binding occurred in the linear range of the assay. For CD4 binding, the envelope glycoprotein-containing supernatants were incubated for 30 minutes at room temperature with a concentration of sCD4 (SmithKline Beecham) empirically determined to precipitate the wtΔ protein optimally. The envelope glycoprotein-sCD4 complexes were then precipitated with the CD4-specific antibody, OKT4 (Ortho), and Protein A-Sepharose (Pharmacia). For binding of the 17b and F105 antibodies, the monoclonal antibodies were preincubated with Protein A-Sepharose prior to overnight incubation with envelope glycoprotein-containing supernatants at 4°C. For binding of the CG10 antibody, envelope glycoprotein-containing supernatants were incubated with 100 nM sCD4 at room temperature for 30 minutes prior to addition of a CG10-Protein G-Sepharose mixture and overnight incubation at 4°C. Immunoprecipitates were washed and run on 12.5% SDS-polyacrylamide gels, which were fixed, dried and analyzed by autoradiography. Binding was quantified by densitometry.
To measure CCR5 binding, envelope glycoprotein-containing supernatants were mixed with 100 nM sCD4 or phosphate-buffered saline (PBS) and incubated at room temperature for 30-60 minutes. L1.2-CCR5 cells (2 × 10^7 cells, LeukoSite, Inc. (3)) were pelleted, resuspended in 500 μl of envelope glycoprotein-containing supernatants, and rocked gently at 37°C for 1 hour. Cells were pelleted, washed twice in PBS and lysed by the addition of NP40 buffer (0.5 M NaCl, 10 mM Tris, pH 7.5, 0.5% NP40). Lysates were cleared (20,000 × g at 4°C for 15 minutes) in a microcentrifuge and the envelope glycoproteins were precipitated overnight at 4°C by a mixture of sera from HIV-1-infected individuals and Protein A-Sepharose. Sepharose pellets were washed in NP40 buffer, boiled in SDS-containing sample buffer and run on 12.5% SDS-polyacrylamide gels. Autoradiographed gels were quantitated using a densitometer.
11. L. Wu et al., J. Exp. Med. 185, 1681 (1997).
12. M. Samson, O. Labbe, C. Mollereau, G. Vassart, M. Parmentier, Biochemistry 35, 3362 (1996); C. Combadiere, S. Ahuja, H. Tiffany, P. Murphy, J. Leukocyte Biol. 60, 157 (1996); C. Raport, J. Gosling, V. Schweickart, P. Gray, I. Charo, J. Biol. Chem. 271, 17161 (1996).
13. M. Posner et al., J. Immunol. 146, 4325 (1991).
14. N. Sullivan et al., J. Virol., in the press.
15. R. Wyatt et al., J. Virol. 66, 6997 (1992); J.P. Moore et al., J. Virol. 67, 4785 (1993); A. Carillo and L. Ratner, J. Virol. 70, 1301 (1996); H.G. Morrison, F. Kirchhoff, R. Desrosiers, Virology 195, 167 (1993); F. Kirchhoff, H. Morrison, R. Desrosiers, Virology 213, 179 (1995); J.P. Moore and J. Sodroski, J. Virol. 70, 1863 (1996).
16. F. Cocchi et al., Nature Med. 2, 1244 (1996); P.D. Bieniasz et al., EMBO J. 16, 2599 (1997); Speck et al., J. Virol. 71, 7136 (1997).
17. M. Farzan et al., J. Virol. 72, 1160 (1998); T. Dragic et al., J. Virol. 72, 279 (1998); G. Rabut et al., J. Virol. 72, 3464 (1998).
18. K. Martin et al., Science 278, 1470 (1997).
19. M.J. Endres et al., Cell 87, 745 (1996); J.D. Reeves and T.F. Schulz, J. Virol. 71, 1453 (1997).
20. Y. Feng, C.C. Broder, P. Kennedy, E. Berger, Science 272, 872 (1996).
21. C. Rizzuto, N. Hernandez and J. Sodroski, unpublished observations.
22. A. Cordonnier, L. Montagnier, A. Cordonnier, J. Virol. 67, 6253 (1993); K. Fujita, J. Silver, K. Peden, J. Virol. 66, 4445 (1992).
23. C.M. Carr and P.S. Kim, Cell 73, 623 (1993); P. Bullough, F. Hughson, J. Skehel, D.C. Wiley, Nature 371, 37 (1994); W. Weissenhorn et al., EMBO J. 15, 1507 (1996); C.-H. Chen, T.J. Matthews, C.B. McDanal, D.P. Bolognesi, M.L. Greenberg, J. Virol. 69, 3771 (1995); C. Wild, T. Oas, C. McDanal, D. Bolognesi, T. Matthews, Proc. Natl. Acad. Sci. USA 89, 10537 (1992); S. Jiang, K. Lin, N. Strick, A.R. Neurath, Nature 365, 113 (1993); S. Jiang, K. Lin, N. Strick, A.R. Neurath, BBRC 195, 553 (1993); D.C. Chan, D. Fass, J.M. Berger, P.S. Kim, Cell 89, 263 (1997); W. Weissenhorn, A. Dessen, S.C. Harrison, J.J. Skehel, D.C. Wiley, Nature 387, 426 (1997).
24. B. Lee and F. Richards, J. Mol. Biol. 55, 379 (1971); S. Sheriff, W.A. Hendrickson, R.E. Stenkamp, L. Sieker, L.H. Jensen, Proc. Natl. Acad. Sci. USA 82, 1104 (1985).
25. Myers et al., Human Retroviruses and AIDS. A compilation of nucleic acid and amino acid sequences. Los Alamos National Laboratory, Los Alamos, NM, 1996.
26. A. Nicholls, K.A. Sharp, B. Honig, Proteins 11, 281 (1991).
Table 1. Phenotypes of HIV-1 gp120 mutants. The ability of the wtΔ and mutant glycoproteins to bind CCR5 expressed on L1.2 cells was determined (10). The recognition of the wtΔ and mutant glycoproteins by sCD4 and monoclonal antibodies was determined (10). All values reported are relative to those seen for the wtΔ protein. Values represent the average of at least two independent experiments and exhibited less than 30% variation from the values shown.
Ligand Binding*
Figure imgf000211_0001
Figure imgf000212_0001
Figure imgf000213_0001
*The numbering of the mutant wtΔ glycoproteins is based on the sequence of the prototypic HXBc2 gp120 glycoprotein (24), with 1 representing the initiator methionine. The wild-type YU2 gp120 residue is listed, followed by the substituted residue. Amino acid abbreviations: A, alanine; D, aspartic acid; E, glutamic acid; F, phenylalanine; G, glycine; H, histidine; I, isoleucine; K, lysine; L, leucine; M, methionine; N, asparagine; P, proline; Q, glutamine; R, arginine; S, serine; T, threonine; V, valine; Y, tyrosine. The fractional solvent accessibilities associated with gp120 residues in which changes specifically disrupted CCR5 binding are shown in parentheses. Fractional solvent accessibility was calculated as the ratio of solvent-accessible surface area for atoms of amino-acid residue X in the gp120 core (without carbohydrate moieties) to the area obtained after reducing the structure to a Gly-X-Gly tripeptide (24). Values cited are for side-chain atoms, except for glycine 441, where the value for all atoms is given.
+The binding of the wtΔ glycoprotein to L1.2-CCR5 cells was shown to be linearly related to the concentration of wtΔ protein in the transfected 293T cell supernatants, over the range of concentrations used in these experiments. The total amount of wtΔ and mutant glycoprotein present in the 293T cell supernatants was estimated by precipitation with an excess of a mixture of sera from HIV-1-infected individuals. The amount of wtΔ and mutant glycoprotein bound to the L1.2-CCR5 cells was determined as described (10). The value for CCR5 binding was calculated using the following formula:
\[
\text{CCR5 binding} = \frac{\text{Bound mutant protein}}{\text{Bound wt}\Delta\text{ protein}} \times \frac{\text{Total wt}\Delta\text{ protein}}{\text{Total mutant protein}}
\]
φThe recognition of the wtΔ and mutant glycoproteins by sCD4 and antibodies was determined by precipitation of radiolabeled envelope glycoproteins in transfected 293T cell supernatants as described (10). In parallel, the labeled envelope glycoproteins were precipitated with an excess of a mixture of sera from HIV-1-infected individuals. The value for ligand binding was calculated using the following formula:
\[
\text{Ligand binding} = \frac{\text{Mutant protein}_{\text{ligand}}}{\text{wt}\Delta\text{ protein}_{\text{ligand}}} \times \frac{\text{wt}\Delta\text{ protein}_{\text{serum mixture}}}{\text{Mutant protein}_{\text{serum mixture}}}
\]
In the sCD4 and 17b columns, the values in bold indicate gp120 residues that exhibit decreased solvent accessibility in the presence of the two-domain sCD4 or 17b Fab, respectively, in the ternary complex (6). Changes in solvent accessibility were calculated using the MS program of Michael Connolly.
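Both Table 1 formulas have the same shape: the mutant-to-wtΔ ratio of the specific signal (protein bound to L1.2-CCR5 cells, or protein precipitated by sCD4 or an antibody), multiplied by the wtΔ-to-mutant ratio of total protein precipitated by the serum mixture. The Python sketch below shows that normalization under stated assumptions; the function name, argument names and densitometry numbers are hypothetical and do not come from the source.

```python
def relative_binding(mutant_signal: float, wt_signal: float,
                     mutant_total: float, wt_total: float) -> float:
    """Binding of a mutant relative to wt-delta, corrected for expression level.

    signal: densitometry value for protein bound to cells or to a ligand.
    total:  densitometry value for protein precipitated by the serum mixture.
    """
    if wt_signal <= 0 or mutant_total <= 0:
        raise ValueError("wt signal and mutant total must be positive")
    if mutant_signal < 0 or wt_total < 0:
        raise ValueError("densitometry values cannot be negative")
    return (mutant_signal / wt_signal) * (wt_total / mutant_total)


# Hypothetical densitometry readings in arbitrary units, for illustration only.
value = relative_binding(mutant_signal=20.0, wt_signal=100.0,
                         mutant_total=50.0, wt_total=100.0)
print(f"relative binding = {value:.2f}")
```

Dividing by total protein keeps a poorly expressed mutant from being scored as a poor binder; by construction the wtΔ protein itself scores 1.0.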

Claims

What is claimed is:
1. A method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine or alanine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (isopropyl or methyl group) of gp120 isoleucine (valine, or alanine) 371 and CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein Het is phenyl, Bn, EtPh, or heteroarylalkyl;
c. a group X that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43, wherein X is hydroxyalkyl, hydroxyaryl, alkylamide, or arylamide;
d. an aromatic group or heteroaromatic group, Het, that binds to the side chain indole group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an alkyl group, R, that binds to the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43, wherein R is alkyl, cycloalkyl, or haloalkyl;
h. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
i. a group X that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of gp120 asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
j. a group Y that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
k. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
l. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the side chain propylalcohol group of threonine 123 and the hydrogen bond interaction between the alpha carbonyl group of CD4 Arg 59, wherein Z is alkoxyalkyl, aryloxyalkyl, alkoxyaryl, haloalkyl, haloaryl, alkylamide, arylamide, alkylcarboxylate, arylcarboxylate, arylalkyl ester, dialkyl ester, or alkylaryl ester.
2. A method of inhibiting the interaction of HIV-gpl20 with leukocyte CD4 which comprises administering to a mammalin need thereof a compound which contains certain functional groups that interact with HIV- gpl20 in a manner that disrupt two or more of the following interactions;
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gpl20 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gpl20 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group of heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons fo the side chain propionate of gpl20 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gpl20 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c . an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gpl20 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp!20 asparagine (or arginine) 425 or disrupts the
CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the side chain propylalcohol group of threonine 123 and the hydrogen bond interaction between the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gp120 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32, wherein Q is dialkylketone, alkylarylketone, or arylalkylketone;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropylalcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 with the side chain ethyl alcohol group of CD4 serine 60.
3. A method of inhibiting the interaction of HIV-gp120 with leukocyte CD4 which comprises administering to a mammal in need thereof a compound which contains certain functional groups that interact with HIV-gp120 in a manner that disrupts two or more of the following interactions:
a. an aromatic group or heteroaromatic group, Het, that binds to the side chain carboxylate group of gp120 aspartic acid 368 or disrupts the dipolar interaction between the side chain carboxylate group of gp120 aspartic acid 368 and the side chain phenyl group of CD4 phenylalanine 43;
b. an aromatic group or heteroaromatic group, Het, that binds to the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 or disrupts the hydrophobic interaction between the alpha, beta or gamma carbons of the side chain propionate of gp120 glutamic acid 370 and the side chain phenyl group of CD4 phenylalanine 43;
c. an aromatic group or heteroaromatic group, Het, that binds to the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 or disrupts the hydrophobic interaction between the side chain isobutyl (or isopropyl group) of gp120 isoleucine (or valine) 371 and CD4 phenylalanine 43;
d. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 asparagine (or arginine) 425 or disrupts the CHO hydrogen bond interaction between the alpha carbonyl group of asparagine (or arginine) 425 and the side chain phenyl group of CD4 phenylalanine 43;
e. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 tryptophan 427 or disrupts the hydrophobic interaction between the side chain indole group of gp120 tryptophan 427 and the side chain phenyl group of CD4 phenylalanine 43;
f. an aromatic group or heteroaromatic group, Het, that binds to the alpha methylene group of gp120 glycine 473 or disrupts the hydrophobic interaction between the alpha methylene group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
g. an aromatic group or heteroaromatic group, Het, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the dipolar interaction between the alpha carbonyl group of gp120 glycine 473 and the side chain phenyl group of CD4 phenylalanine 43;
h. a group, Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the ionic interaction between the side chain carboxyl group of aspartic acid 368 and the side chain guanidinium group of CD4 arginine 59, wherein Y is alkylammonium, dialkylammonium, arylammonium, arylalkylammonium, alkylguanidinium, piperidinium, pyrrolidinium, or pyridinium;
i. an alkyl group, R, aromatic or heteroaromatic group, Het, that binds to the side chain isopropyl (or methyl) group of gp120 valine (or alanine) 430 or disrupts the hydrophobic interaction between the side chain isopropyl (or methyl) group of valine (or alanine) 430 and the side chain guanidinium group of CD4 arginine 59;
j. a group, Z, that binds to the side chain propylalcohol group of gp120 threonine 123 or disrupts the side chain propylalcohol group of threonine 123 and the hydrogen bond interaction between the alpha carbonyl group of CD4 Arg 59; and/or
k. a group, X, that binds to the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine (alanine, or glutamic acid) 472 with the side chain amide group of CD4 glutamine 40;
l. a group, Z, that binds to the alpha amino group of gp120 glycine (alanine, or glutamic acid) 472 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (alanine, or glutamic acid) 473 with the side chain propionamide group of CD4 glutamine 40;
m. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.6 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the side chain propionamide group of CD4 glutamine 40;
n. a group, Z, that binds to the alpha amino group of gp120 aspartic acid (or asparagine) 474 at a distance of 3.4 ang., or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid (or asparagine) 474 with the alpha carbonyl group of CD4 glutamine 40;
o. a group, X, that binds to the alpha carbonyl group of gp120 methionine (or serine) 426 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 methionine (or serine) 426 and the side chain hydroxyl group of CD4 serine 42;
p. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the alpha amino group of CD4 serine 42;
q. a group, X, that binds to the alpha carbonyl group of gp120 tryptophan 427 or disrupts the hydrogen bond interaction between the alpha carbonyl of gp120 tryptophan 427 and the side chain hydroxyl group of CD4 serine 42;
r. a group, X, that binds to the alpha amino group of gp120 lysine 429 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
s. a group, X, that binds to the alpha carbonyl group of gp120 lysine or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 lysine (threonine or asparagine) 429 and the side chain hydroxyl group of CD4 serine 42;
t. a group, X, that binds to the alpha amino group of gp120 valine (or alanine) 430 or disrupts the hydrogen bond between the alpha amino group of gp120 valine (or alanine) 430 and the side chain hydroxyl group of CD4 serine 42;
u. a group, X, that binds to the alpha carbonyl group of gp120 glycine 473 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 glycine 473 and the alpha amino group of CD4 serine 42;
v. a group, X or Y, that binds to the side chain carboxyl group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the side chain carboxyl group of gp120 aspartic acid 368 and the alpha amino group of CD4 leucine 44;
w. a group, X, that binds to the alpha amino group of gp120 aspartic acid 368 or disrupts the hydrogen bond between the alpha amino group of gp120 aspartic acid 368 and the alpha carbonyl group of CD4 leucine 44;
x. an alkyl group, R, that binds to the isobutyl (or isopropyl) group of gp120 isoleucine (or valine) 271 or disrupts the hydrophobic interaction between the isobutyl group of gp120 isoleucine (or valine) 271 and the side chain hydroxypropyl group of CD4 threonine 45;
y. a group, X, that binds to the alpha amino group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 366 and the alpha carbonyl group of CD4 lysine 46;
z. a group, X, that binds to the alpha carbonyl group of gp120 glycine 366 or disrupts the hydrogen bond interaction between the alpha carbonyl group of gp120 glycine 366 and the alpha amino group of CD4 lysine 46;
a1. a group, X, that binds to the alpha amino group of gp120 glycine 367 or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine 367 and the alpha carbonyl group of CD4 lysine 46;
b1. a group, Y, that binds to the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 at a distance of 3.3 ang., or disrupts the hydrogen bond interaction between the side chain acetamide (or methyl alcohol) group of gp120 asparagine (or serine) 280 and the side chain butylammonium group of CD4 lysine 29;
c1. a group, Q, that binds to the alpha methylene (or methine) group of gp120 glycine (or valine) 459 at a distance of 3.1 ang., or disrupts the dipolar interaction between the alpha methylene (or methine) group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
d1. a group, Z, that binds to the alpha amino group of gp120 glycine (or valine) 459 at a distance of 3.4 ang., or disrupts the hydrogen bond interaction between the alpha amino group of gp120 glycine (or valine) 459 and the alpha carbonyl group of CD4 asparagine 32;
e1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the alpha carbonyl group of CD4 glutamine 33;
f1. a group, X, that binds to the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 or disrupts the hydrogen bond between the side chain amide (or hydroxyl) group of gp120 asparagine (or serine) 280 with the side chain amide group of CD4 glutamine 33;
g1. a group, X, that binds to the alpha amino group of gp120 glycine (or valine) 459 or disrupts the hydrogen bond between the alpha amino group of gp120 glycine (or valine) 459 with the side chain propionamido group of CD4 glutamine 33;
h1. a group, X, that binds to the alpha carbonyl group of gp120 serine (or alanine) 365 or disrupts the hydrogen bond between the alpha carbonyl group of gp120 serine (or alanine) 365 with the side chain amide of CD4 asparagine 52; and/or
i1. a group, X, that binds to the side chain isopropylalcohol group of gp120 threonine 123 at a distance of 3.8 ang., or disrupts the hydrogen bond between the side chain isopropyl alcohol group of gp120 threonine 123 with the side chain ethyl alcohol group of CD4 serine 50.
j1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan (or phenylalanine) 112 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
k1. an alkyl group, R, or an aromatic or heteroaromatic group that binds to the side chain phenyl group of gp120 phenylalanine 382 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
l1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 384 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
m1. an alkyl group, R, that binds to the side chain alkyl group of gp120 valine (isoleucine, or glutamine) 255 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
n1. a group, X, that binds to the side chain hydroxyl group of gp120 threonine 257 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
o1. a group, Y, that binds to the side chain carboxyl group of gp120 glutamic acid 370 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
p1. an alkyl group, R, that binds to the side chain isobutyl group of gp120 isoleucine 424 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43;
q1. an aromatic or heteroaromatic group that binds to the side chain indole group of gp120 tryptophan 427 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43; and/or
r1. an aromatic or heteroaromatic group that binds to the side chain phenolic group of gp120 tyrosine 435 and/or disrupts the aforementioned interactions of gp120 with CD4 phenylalanine 43.
PCT/US1998/023906 1997-11-10 1998-11-10 COMPOUNDS INHIBITING CD4-gp120 INTERACTION AND USES THEREOF WO1999024065A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU14545/99A AU1454599A (en) 1997-11-10 1998-11-10 Compounds inhibiting cd4-gp120 interaction and uses thereof

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US96770897A 1997-11-10 1997-11-10
US08/967,708 1997-11-10
US10076498A 1998-06-18 1998-06-18
US09/100,764 1998-06-18

Publications (1)

Publication Number Publication Date
WO1999024065A1 (en) 1999-05-20

Family

ID=26797520

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/023906 WO1999024065A1 (en) 1997-11-10 1998-11-10 COMPOUNDS INHIBITING CD4-gp120 INTERACTION AND USES THEREOF

Country Status (2)

Country Link
AU (1) AU1454599A (en)
WO (1) WO1999024065A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5851529A (en) * 1988-03-21 1998-12-22 Guber; Harry E. Recombinant retroviruses
US5109123A (en) * 1988-06-14 1992-04-28 Dana Farber Cancer Institute Alteration of ability of soluble CD4 fragments to bind HIV
US5614612A (en) * 1990-03-09 1997-03-25 Haigwood; Nancy L. Purified gp120 compositions retaining natural conformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN S., ET AL.: "DESIGN AND SYNTHESIS OF A CD4 BETA-TURN MIMETIC THAT INHIBITS HUMAN IMMUNODEFICIENCY VIRUS ENVELOPE GLYCOPROTEIN GP120 BINDING AND INFECTION OF HUMAN LYMPHOCYTES.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, NATIONAL ACADEMY OF SCIENCES, US, vol. 89., 1 July 1992 (1992-07-01), US, pages 5872 - 5876., XP002916732, ISSN: 0027-8424, DOI: 10.1073/pnas.89.13.5872 *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6511826B2 (en) 1995-06-06 2003-01-28 Human Genome Sciences, Inc. Polynucleotides encoding human G-protein chemokine receptor (CCR5) HDGNR10
US6743594B1 (en) 1995-06-06 2004-06-01 Human Genome Sciences, Inc. Methods of screening using human G-protein chemokine receptor HDGNR10 (CCR5)
US6759519B2 (en) 1995-06-06 2004-07-06 Human Genome Sciences, Inc. Antibodies to human G-protein chemokine receptor HDGNR10 (CCR5receptor)
US6800729B2 (en) 1995-06-06 2004-10-05 Human Genome Sciences, Inc. Human G-Protein chemokine receptor HDGNR10 (CCR5 receptor)
US7829711B2 (en) 2004-11-09 2010-11-09 Bristol-Myers Squibb Company Crystalline materials of 1-(4-benzoyl-piperazin-1-yl)-2-[4-methoxy-7-(3-methyl-[1,2,4]triazol-1-yl)-1H-pyrrolo[2,3-C]pyridine-3-yl]-ethane-1,2-dione
US7851476B2 (en) 2005-12-14 2010-12-14 Bristol-Myers Squibb Company Crystalline forms of 1-benzoyl-4-[2-[4-methoxy-7-(3-methyl-1H-1,2,4-triazol-1-YL)-1-[(phosphonooxy)methyl]-1H-pyrrolo[2,3-C]pyridin-3-YL]-1,2-dioxoethyl]-piperazine
US7807671B2 (en) 2006-04-25 2010-10-05 Bristol-Myers Squibb Company Diketo-piperazine and piperidine derivatives as antiviral agents
US7807676B2 (en) 2006-04-25 2010-10-05 Bristol-Myers Squibb Company Diketo-Piperazine and Piperidine derivatives as antiviral agents
US9776963B2 (en) 2008-11-10 2017-10-03 The Trustees Of The University Of Pennsylvania Small molecule CD4 mimetics and uses thereof
WO2011024175A1 (en) 2009-08-28 2011-03-03 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Macrocyclic compounds, compositions comprising them and methods for preventing or treating hiv infection
US8637036B2 (en) 2009-09-25 2014-01-28 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Neutralizing antibodies to HIV-1 and their use
US9175070B2 (en) 2009-09-25 2015-11-03 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Neutralizing antibodies to HIV-1 and their use
US9738703B2 (en) 2009-09-25 2017-08-22 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Neutralizing antibodies to HIV-1 and their use
US10035845B2 (en) 2009-09-25 2018-07-31 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Neutralizing antibodies to HIV-1 and their use
US9695230B2 (en) 2011-12-08 2017-07-04 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Broadly neutralizing HIV-1 VRC07 antibodies that bind to the CD4-binding site of the envelope protein
US10035844B2 (en) 2011-12-08 2018-07-31 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Broadly neutralizing HIV-1 VRC07 antibodies that bind to the CD4-binding site of the envelope protein
US10815295B2 (en) 2011-12-08 2020-10-27 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Broadly neutralizing HIV-1 antibodies that bind to the CD4-binding site of the envelope protein
US9403763B2 (en) 2011-12-14 2016-08-02 Dana-Farber Cancer Institute, Inc. CD4-mimetic inhibitors of HIV-1 entry and methods of use thereof
US9341639B2 (en) 2013-07-26 2016-05-17 Industrial Technology Research Institute Apparatus for microfluid detection
US9975848B2 (en) 2014-08-13 2018-05-22 The Trustees Of The University Of Pennsylvania Inhibitors of HIV-1 entry and methods of use thereof

Also Published As

Publication number Publication date
AU1454599A (en) 1999-05-31

Similar Documents

Publication Publication Date Title
Kwong et al. Structures of HIV-1 gp120 envelope glycoproteins from laboratory-adapted and primary isolates
Kong et al. Antigenicity and immunogenicity in HIV-1 antibody-based vaccine design
US8268323B2 (en) Conformationally stabilized HIV envelope immunogens
EP3189067A1 (en) Recombinant hiv-1 envelope proteins and their use
EP2873423B1 (en) Soluble hiv-1 envelope glycoprotein trimers
AU2017259275B2 (en) Compositions and methods related to HIV-1 immunogens
US20140348865A1 (en) Immunogens based on an hiv-1 v1v2 site-of-vulnerability
Liu et al. Structure of the HIV-1 gp41 membrane-proximal ectodomain region in a putative prefusion conformation
US7048929B1 (en) Stabilized primate lentivirus envelope glycoproteins
WO1999024065A1 (en) COMPOUNDS INHIBITING CD4-gp120 INTERACTION AND USES THEREOF
Daniels et al. Antibody responses to the HIV-1 envelope high mannose patch
Prabakaran et al. Structure and function of the HIV envelope glycoprotein as entry mediator, vaccine immunogen, and target for inhibitors
US9775895B2 (en) HIV therapeutics and methods of making and using same
Wu et al. Structural basis of diverse peptide accommodation by the rhesus macaque MHC class I molecule Mamu-B* 17: insights into immune protection from simian immunodeficiency virus
WO1999024553A9 (en) X-ray crystal comprising hiv-1 gp120
US20170326229A1 (en) Swarm immunization with 54 envelopes from ch505
WO2004053100A2 (en) Immunogenic mutant human immunodeficiency virus gp120 polypeptides, and methods of using same
Cheng Elicitation of antibody responses against the HIV-1 gp41 Membrane Proximal External Region (MPER)
Moyo Role of envelope compactness and glycosylation in HIV-1 resistance to neutralising antibody responses
Ghiara Structural mapping of the V3 loop neutralization site of HIV-1: Crystallographic analysis of Fab-gp120 peptide complexes
US20020173446A1 (en) Compounds which bind to the central cavity between HIV-1 gp120 and CD4 and uses thereof
Guenaga Epitope Scaffolds And The HIV-1 gp41 2F5 Neutralization Determinant
US20160122395A1 (en) Immunogenic compositions and a process for producing same
Klein HIV-1 gp120: A novel, large-scale purification technique and thermodynamic characterization of the binding of two small-molecule inhibitors

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA