17/01/2024
Join us this time next week for our first CeADAR Tech Talk of 2024, 'GPT-like Pre-Training on Unlabelled System Logs for Malware Detection', with guest speaker Dmitrijs Trizna of Microsoft.
In recent years, self-supervised language modeling techniques, such as those used in GPT-like language models, have shown great success in natural language processing tasks without requiring supervision from domain experts to learn language semantics. In this talk, we explore the transferability of these techniques to system logs and share a methodology for pre-training a Transformer model on unlabeled logs for malware detection.
Infrastructures generate vast amounts of system logs suitable for cybersecurity needs, but only a fraction of these logs are labeled and annotated for specific events or anomalies. Our experiments demonstrate that pre-training the model on unlabeled system logs leads to improved performance on the task of malware detection, compared to training on labeled data alone. Moreover, we show that the pre-trained model learns patterns that are similar to what a human engineer would consider relevant in detecting malware.
These findings highlight the potential of pre-training GPT-like models on system logs for cybersecurity applications, and demonstrate the benefits of self-supervised learning approaches in domains where labeled data is scarce. Overall, our work contributes to the growing body of literature on applying language modeling techniques beyond natural language processing and opens up new avenues for research in the field of cybersecurity.
Register using the link below:
https://ucd-ie.zoom.us/webinar/register/WN_WZ-MAtOJQii8cs3Bmi0fUw#/registration
06/12/2023
A new article in the No. 93 issue of the magazine " " is now available: Development of a low-noise for drones toward an era of drones flying around cities.
https://global.jaxa.jp/activity/pr/jaxas/no093/04.html
30/11/2023
Artificial intelligence pioneer leaves Google and warns about technology's future
Geoffrey Hinton, a trailblazer in AI, has joined the growing list of experts sharing their concerns about its rapid advancement.
06/08/2023
IBM open sources the largest NASA AI model on Hugging Face | IBM Research Blog
It aims to widen access to NASA satellite data and accelerate climate-related discoveries.
13/07/2023
IEEE GRSS-USC MHI 2023 Remote Sensing Summer School
Date: July 13-15, 2023 (Thursday to Saturday)
Venue:
University of Southern California
Hughes Aircraft Electrical Engineering Center (EEB)
3740 McClintock Ave, Los Angeles, CA 90089
Contact: [email protected], [email protected]
2023 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2023), Pasadena, California, USA, 16 - 21 July, 2023
05/07/2023
Geospatial Python Libraries
70 Geospatial Python Libraries
Python has emerged as a dominant language in the field of Geographic Information Systems (GIS) and remote sensing due to its versatility…
27/06/2023
https://summit2023.landcarbonlab.org/livestream/?utm_medium=event&utm_source=calendar&utm_campaign=lclsummit2023
Livestream – Land & Carbon Lab
Land & Carbon Lab's 2023 Summit: Monitoring Land, Mobilizing Action. Livestreaming 27 – 29 June. Land & Carbon Lab’s Summit is now live! You can access livestreamed sessio...