Problem C
Harald Krukmakare and the Cursed LLM
Kalle loves Harald Krukmakare. One day, when he was away visiting his gramps (grandpa), his little sister sneaked into his room and started scribbling in his Harald Krukmakare collection. Why did she do this? No one knows. We do know that when Kalle got home he was devastated. Many words were no longer legible and he could no longer read his favourite books.
Then Kalle had a brilliant idea! He could let an LLM fill in the words for him, because that is exactly what they are good at: predicting the next word in a sequence. But Kalle’s finances are not great at the moment, so he does not want to waste any money on LLM tokens, and buying a computer to run one locally is out of the question.
Fortunately, Kalle’s family owns a huge library of old fantasy books at home. The stories are full of wizards, dragons and magic, and their language is similar enough to Harald Krukmakare. Kalle lets you study this library first, so that you can learn how such stories are usually written before you attempt to repair his beloved books. Can you help Kalle find a cost-effective way of filling in the incomplete sentences?
There is one catch, though. Kalle was too lazy to write down the entire book with all the missing words to send to you. Instead, he has written down the five words prior to each missing word. He figured this would be enough for you to get a good enough result for him to read the books again.
Since Kalle is a bit impatient, he will only ask your LLM for words that aren’t too "simple" (for example "the"), because he can just fill those in himself.
For this task we have given you training data, and you are not allowed to find your own training data on the internet.
Input
Download the file with training and test data. This can be found at the bottom under “attachments”. You will receive a zip file containing:
- train.txt - The fantasy book collection Kalle has at home in text on one line.
- test.txt - The incomplete sentences with five words prior to the missing word.
- baseline.py - A baseline solution to the problem.
The train.txt file contains one row with the entire text of the fantasy book collection.
The test.txt file contains multiple rows, each with five words prior to a missing word.
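One low-cost way to use the two files is a count-based n-gram model with backoff. The sketch below is only an illustration and is not the contents of baseline.py; the function names `build_model` and `predict`, the whitespace tokenization, and the choice of `max_context=2` are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_model(tokens, max_context=2):
    """Count which words follow each context of 0..max_context words."""
    model = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for n in range(max_context + 1):
            if i - n < 0:
                break  # not enough preceding words for this context length
            context = tuple(tokens[i - n:i])
            model[context][word] += 1
    return model


def predict(model, context_words, k=6, max_context=2):
    """Return up to k distinct guesses, trying the longest context first."""
    guesses = []
    for n in range(max_context, -1, -1):
        start = max(0, len(context_words) - n)
        context = tuple(context_words[start:])
        # .get avoids creating empty entries in the defaultdict
        for word, _ in model.get(context, Counter()).most_common():
            if word not in guesses:
                guesses.append(word)
                if len(guesses) == k:
                    return guesses
    return guesses
```

With the real data, one would call `build_model(open("train.txt").read().split())` once and then `predict(model, row.split())` for each row of test.txt. Because the statement says Kalle never asks for "simple" words, dropping very common words from the guesses may help, but which words count as simple is not specified, so any such filter is a guess.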
Output
For each row in test.txt, output 6 lines each containing a word: your predictions for what the missing word could be, followed by an empty line.
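The output layout can be sketched as follows. The helper name `format_predictions` and the `filler` word used to pad a test case with fewer than six guesses are assumptions for this example; the statement does not say what happens if fewer than six words are printed, so padding is simply the cautious choice.

```python
def format_predictions(all_guesses, k=6, filler="unknown"):
    """Render k words per test case, one per line, each case followed by an empty line."""
    lines = []
    for guesses in all_guesses:
        # Pad or truncate so that every test case has exactly k lines.
        padded = (list(guesses) + [filler] * k)[:k]
        lines.extend(padded)
        lines.append("")  # the empty line that closes this test case
    return "\n".join(lines) + "\n"
```

For two test cases, the returned string has 14 lines: six words and one empty line per case.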
Scoring
Your solution is evaluated based on whether one of your six predicted words matches the actual missing word for a given test case.
If $S$ is the percentage of test cases where one of your predicted words matches the actual missing word, then your score is calculated as:
\[ \text{Score} = \max \left(0, \min \left(100, \sqrt{\frac{S-1.2}{2.5-1.2}} \times 100\right)\right) \]
At the end of the competition, all solutions will be retested on the remaining 70% of the data. Your final score at the end of the competition will only be based on the remaining 70% of the data; the 30% tested during the competition will have no effect. It is guaranteed that the 30% tested during the competition were chosen uniformly at random and are entirely disjoint from the 70% tested at the end. Therefore, the results on the 30% tested during the competition should be seen as a strong indicator of how well your solution performs. At the same time, it is detrimental to overfit your solution to the test data.
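To estimate a score locally, for example on a held-out part of train.txt, the scoring formula can be written as a small function. This is a sketch: the name `score` is hypothetical, and treating any $S$ below 1.2 as a score of 0 is an assumption, since the square root is not a real number there and the outer $\max$ suggests that reading.

```python
import math


def score(s_percent):
    """Map the hit percentage S (0..100) to a score in 0..100."""
    ratio = (s_percent - 1.2) / (2.5 - 1.2)
    if ratio <= 0:
        return 0.0  # assumption: S at or below 1.2 yields a score of 0
    return min(100.0, math.sqrt(ratio) * 100)
```

Under this formula, a hit rate of 1.2% or less scores 0, and 2.5% or more already gives the full 100.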
