Thursday, February 16, 2023

Pretrained Models with Adversarial Training for Online Sexism Detection @ SemEval 2023

Abstract

Adversarial training can give neural networks significantly better resistance to adversarial attacks, improving model robustness. However, a major drawback of many existing adversarial training workflows is the computational cost and extra processing time incurred by data augmentation techniques. This post explores applying embedding perturbations via the fast gradient method (FGM) when finetuning large language models (LLMs) for short text classification tasks. The approach was evaluated on the first sub-task of SemEval-2023 Task 10, which focuses on the explainable detection of online sexism (EDOS). Empirical results show that models adversarially finetuned with FGM had, on average, a 25% longer training time and a 0.2% higher F1 than their respective baselines.
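To make the FGM step concrete, here is a minimal sketch of the perturbation itself, r = epsilon * g / ||g||_2, where g is the gradient of the loss with respect to the embedding. Everything else in the snippet is an assumption made for illustration: the toy linear scorer, the logistic loss, the example vectors and the value of epsilon are hypothetical and are not taken from the post's actual models or training setup.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * grad / ||grad||_2 (a zero vector if grad is zero)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


def logistic_loss(embedding, weights, label):
    """Logistic loss of a toy linear scorer; label is -1 or +1."""
    score = sum(e * w for e, w in zip(embedding, weights))
    return math.log1p(math.exp(-label * score))


def loss_grad_wrt_embedding(embedding, weights, label):
    """Analytic gradient of logistic_loss with respect to the embedding."""
    score = sum(e * w for e, w in zip(embedding, weights))
    coeff = -label / (1.0 + math.exp(label * score))
    return [coeff * w for w in weights]


# Hypothetical toy values, chosen only to exercise the functions.
embedding = [0.5, -1.0, 0.25]
weights = [1.0, 0.5, -2.0]
label = 1

grad = loss_grad_wrt_embedding(embedding, weights, label)
r_adv = fgm_perturbation(grad, epsilon=0.5)
adv_embedding = [e + r for e, r in zip(embedding, r_adv)]

clean_loss = logistic_loss(embedding, weights, label)
adv_loss = logistic_loss(adv_embedding, weights, label)
print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

In a typical FGM finetuning loop, the loss on the perturbed embeddings is backpropagated in addition to the clean loss before the optimizer step, which means a second forward and backward pass per batch. That would be consistent with the longer training time reported above, although the exact loop used in the post may differ.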

Tuesday, January 24, 2023

The string similarity problem

For two strings A and B (in the ASCII [a-z] range), we define the similarity of the strings to be the length of the longest prefix common to both strings. For example, the similarity of strings "abc" and "abd" is 2, while the similarity of strings "aaa" and "aaab" is 3.
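The definition translates directly into a few lines of Python. This is only a sketch; the function name `similarity` is my own choice and is not part of the problem statement.

```python
def similarity(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    length = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        length += 1
    return length


print(similarity("abc", "abd"))   # 2
print(similarity("aaa", "aaab"))  # 3
```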

The task is to calculate the sum of the similarities of a string S with each of its suffixes, including S itself. Reference: https://www.hackerrank.com/challenges/string-similarity/problem
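Comparing S against every suffix character by character takes quadratic time in the worst case. One possible linear-time approach, sketched below, relies on the Z-algorithm: the Z-array stores, for each position i, the length of the longest common prefix of S and the suffix starting at i, so the answer is simply the sum of the Z-array. The function names `z_array` and `similarity_sum` are illustrative, and this is one way to solve the problem rather than the only one.

```python
def z_array(s: str) -> list:
    """z[i] = length of the longest common prefix of s and s[i:]."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n  # the whole string matches itself
    left, right = 0, 0  # [left, right) is the rightmost matched window
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position.
            z[i] = min(right - i, z[i - left])
        # Extend the match by direct comparison.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def similarity_sum(s: str) -> int:
    """Sum of similarities of s with each of its suffixes."""
    return sum(z_array(s))


print(similarity_sum("ababaa"))  # 11
print(similarity_sum("aa"))      # 3
```

For "ababaa" the suffixes "ababaa", "babaa", "abaa", "baa", "aa" and "a" contribute 6, 0, 3, 0, 1 and 1 respectively, giving 11.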