Application and Optimization of Prompt Engineering Techniques for Code Generation in Large Language Models
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, particularly in code generation and task-oriented reasoning. However, their accuracy and reliability in generating correct solutions remain limited by the lack of task-specific prior knowledge and by the shortcomings of existing prompt engineering techniques. Current state-of-the-art approaches such as PAL rely on manually crafted prompts and examples yet often produce suboptimal results. Moreover, although numerous prompt engineering techniques have been developed to improve performance, selecting the most effective technique for a given task remains difficult because different queries exhibit varying levels of complexity.
This work presents an integrated approach to the application and optimization of prompt engineering for code generation. First, it introduces TITAN, a novel framework that refines language model reasoning and task execution through step-back and chain-of-thought prompting. TITAN eliminates the need for extensive manual, task-specific instructions by leveraging the model's analytical and code-generation capabilities, achieving state-of-the-art zero-shot performance on multiple tasks. Second, it proposes PET-Select, a prompt-engineering-agnostic model that classifies queries by code complexity and dynamically selects the most suitable prompt engineering technique using contrastive learning. This enables PET-Select to optimize prompt selection, improving accuracy while significantly reducing token usage.
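To make the two-stage prompting that TITAN describes concrete, the following is a minimal sketch, assuming a generic complete(prompt) helper that wraps an LLM API call; the helper and the prompt wording are hypothetical illustrations, not the framework's actual templates.

```python
def complete(prompt: str) -> str:
    """Placeholder for a call to any text-completion LLM API."""
    raise NotImplementedError

def step_back_cot_generate(task: str) -> str:
    # Step-back prompting: elicit the general concepts behind the task
    # before attempting a solution.
    principles = complete(
        "Before solving, state the general concepts and constraints "
        f"relevant to this task:\n{task}"
    )
    # Chain-of-thought prompting: reason step by step, conditioned on
    # the abstracted principles.
    reasoning = complete(
        f"Task:\n{task}\n\nRelevant concepts:\n{principles}\n\n"
        "Think through a solution step by step."
    )
    # Final zero-shot code generation, with no hand-written examples.
    return complete(
        f"Task:\n{task}\n\nReasoning:\n{reasoning}\n\n"
        "Write a Python function implementing the solution. Code only."
    )
```

Similarly, the selection step in PET-Select can be pictured as a nearest-centroid lookup in an embedding space trained with contrastive learning; the candidate-technique list and centroid representation below are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical selection step: embed the query with a contrastively
# trained encoder, then choose the prompt engineering technique whose
# learned centroid is most similar to the query embedding.
TECHNIQUES = ["zero-shot", "chain-of-thought", "step-back", "PAL"]

def select_technique(query_emb: np.ndarray, centroids: np.ndarray) -> str:
    # Cosine similarity between the query and each technique centroid;
    # centroids has shape (num_techniques, dim), query_emb shape (dim,).
    sims = centroids @ query_emb
    sims /= np.linalg.norm(centroids, axis=1) * np.linalg.norm(query_emb)
    return TECHNIQUES[int(np.argmax(sims))]
```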
Comprehensive evaluations across diverse benchmarks, including HumanEval, MBPP, and APPS, demonstrate the effectiveness of TITAN and PET-Select. TITAN achieves up to a 7.6 percent improvement over existing zero-shot methods, while PET-Select improves pass@1 accuracy by up to 1.9 percent and reduces token consumption by 49.9 percent. This work represents a significant advancement in optimizing prompt engineering for code generation in large language models, offering a robust, automated solution for improving performance on complex and diverse programming tasks.
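For reference, pass@1 is the standard functional-correctness metric for code generation; the sketch below uses the unbiased pass@k estimator from Chen et al. (2021), where n samples are drawn per problem and c of them pass the unit tests (for k = 1 it reduces to c/n). The sample counts in the comment are illustrative, not values from this work.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): probability that
    at least one of k samples, drawn from n generations of which c are
    correct, passes the unit tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per problem, 4 correct -> pass@1 == 0.4
assert pass_at_k(10, 4, 1) == 0.4
```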