28/04/2024
1. TOPICS OF INTEREST
The workshop focuses on bridging the gap between the theory and practice of testing and evaluating LLMs and their applications. The topics cover all aspects of testing and evaluation of LLMs, including, but not limited to, the following:
1) Methodology: The testing and evaluation of LLMs may take place in different contexts, for example, as a part of training, fine-tuning, and application development. The methodological aspects of performing testing and evaluation of LLMs in such contexts include:
processes of testing and evaluation,
quality assurance methodology for LLMs, such as quality attributes and performance metrics of LLMs,
benchmark construction, validation, and application methodology,
integration of testing and evaluation of LLMs with machine learning research methodology, such as machine learning development and operations (MLDevOps), AI application development methodology, and software engineering methodology.
2) Technology: Techniques and methods that are used in the various activities of testing and evaluating LLMs and that support the methodology, which include:
techniques for the generation, selection, cleaning, labelling, and balancing of test data for testing and evaluation of LLMs,
techniques for statistical analysis, visualisation, and comparison of test results,
techniques for test scenario identification, formulation, representation, combination, and coverage,
techniques for test adequacy and requirements definition and measurement,
techniques for testing and evaluation of various applications of LLMs, such as program code generation, other software engineering tasks, and audio, video, and text generation,
techniques for testing and evaluation of LLM algorithms, such as ablation studies,
techniques for testing and evaluation of LLM capabilities on various text and multi-modal language processing tasks,
techniques for testing and evaluation of LLMs on specific aspects, such as robustness, bias, hallucination, safety, privacy, and other ethical issues.
3) Tools and environments: Issues in the development, operation, maintenance, and evolution of testing and evaluation tools, platforms, library code, infrastructures, and environments that enable the testing and evaluation of LLMs, such as feature stores and open-source platforms.
28/02/2024
23/02/2024