Scientific paper: The arrangement of the amino acids in proteins
Background and aim
Frederick Sanger examined the fundamental question of how amino acids are arranged within proteins and why that arrangement matters. He set out to show that proteins are not random mixtures of amino acids but distinct chemical entities with a defined sequence of residues. Insulin, a small and biologically important protein, served as the model system that made the general conclusions accessible and experimentally tractable.
Analytical strategy
A central tactic was to identify the ends and internal segments of the peptide chain by selective chemical labeling and controlled fragmentation. The technique for labeling the N-terminal amino acid with 1-fluoro-2,4-dinitrobenzene (the Sanger reagent) allowed unambiguous identification of the terminal residue after hydrolysis. Complementary approaches used specific chemical and enzymatic cleavage to break the protein into overlapping peptide fragments, which were then separated and characterized by chromatography and the other analytical techniques available at the time.
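The logic of terminal labeling can be illustrated with a small, idealized sketch. It assumes a hypothetical toy peptide written in one-letter codes and a hypothetical helper, `label_and_hydrolyze`. It ignores real chemical complications, such as side-chain labeling or residues lost during hydrolysis. The point is only that complete hydrolysis discards residue order, while the label preserves the identity of the N-terminal residue.

```python
from collections import Counter

def label_and_hydrolyze(peptide: str) -> tuple[str, Counter]:
    """Idealized model of N-terminal labeling followed by total hydrolysis.

    Returns the residue that carries the label (the N-terminal one) and
    the unordered composition of the remaining free residues.
    """
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    labeled_residue = peptide[0]           # only the first residue is tagged
    free_residues = Counter(peptide[1:])   # order is lost on hydrolysis
    return labeled_residue, free_residues

# Two hypothetical peptides with identical composition but different order
terminal_1, rest_1 = label_and_hydrolyze("ACDCA")
terminal_2, rest_2 = label_and_hydrolyze("CADCA")
print(terminal_1, dict(rest_1))
print(terminal_2, dict(rest_2))
```

In this toy model the two peptides have the same overall composition, yet the labeled residue differs, which is the kind of positional information a composition analysis alone cannot provide.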
Reconstructing sequence from fragments
Overlapping fragments provided the critical information needed to order residues unambiguously. By comparing the sequences of different fragments that shared common subsequences, the positions of amino acids in the original polypeptide chain could be deduced. The combination of terminal labeling and systematic fragmentation allowed reconstruction of a complete primary structure from its constituent parts, even when the intact protein was relatively large for the experimental methods then available.
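The overlap reasoning described above can be sketched in a few lines of code. This is a minimal illustration, not a reconstruction of the original laboratory procedure: the fragments are a hypothetical toy peptide in one-letter codes, the function names `overlap` and `assemble` are invented for the example, and the greedy merge assumes that overlaps are unambiguous (no repeated subsequences), which real proteins do not guarantee.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` residues exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def assemble(fragments: list[str], min_overlap: int = 2) -> list[str]:
    """Greedily merge fragments that share suffix/prefix overlaps.

    Returns the remaining contiguous sequences; a single entry means the
    fragments were sufficient to order every residue.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments wholly contained in a longer fragment
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the chain cannot be ordered further
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags

# Hypothetical overlapping fragments of a toy 14-residue peptide
print(assemble(["HIKLMN", "ACDEF", "LMNPQ", "DEFGHIK"]))
```

When fragments share no overlap, the function returns them unmerged, which mirrors the experimental situation: without a bridging fragment, the relative order of two segments remains undetermined.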
Insulin as the proof-of-principle
Insulin was resolved into two polypeptide chains linked by disulfide bridges, and the sequences of both chains were determined. The demonstration that insulin has a definite and reproducible sequence established the principle that proteins have precise primary structures. The presence of interchain disulfide bonds was shown to be essential to the protein's covalent architecture, and the chain-by-chain sequencing clarified how cysteine residues form specific cross-links that stabilize the overall molecule.
Conceptual conclusions
Proteins were established as defined chemical entities whose biological properties depend on the linear order of amino acids. The idea that a protein's specific amino acid sequence encodes its chemical identity implied a direct molecular link between genetic information and biochemical function. The methods emphasized that careful chemical degradation and analysis could reveal the primary structure of proteins and that sequence differences could underlie functional diversity among proteins.
Impact and perspective
The methodological advances and the clear demonstration using insulin shifted the study of proteins from compositional descriptions to determinations of exact sequences. This change fostered the development of broader sequencing strategies and set expectations about the relationship between sequence, structure, and function. The approach also highlighted practical challenges: larger proteins require more elaborate fragmentation and mapping strategies, and chemical modifications or covalent cross-links complicate analysis. These difficulties pointed the way to the evolving biochemical tools and concepts of the decades of work that followed.
The arrangement of the amino acids in proteins
The paper discusses why the arrangement of amino acids within a protein matters and explains how the primary structure of a protein was first determined, using insulin as a model.
- Publication Year: 1957
- Type: Scientific paper
- Genre: Scientific
- Language: English
- Awards: Nobel Prize in Chemistry 1958
Author: Frederick Sanger

More about Frederick Sanger
- Occupation: Scientist
- From: United Kingdom
- Other works:
- The insulin molecule (1960 Scientific paper)
- Sequences, segments, structures and interactions of proteins and nucleic acids (1969 Scientific paper)