Problem F
Scribble
Your friend Harry likes to scribble. To save paper, he draws multiple things on top of each other. You found his old sketchbook with clean drawings and want to figure out what he’s scribbled in his messy composite images.
Harry always draws exactly 3 objects together: any fewer leaves too much space, and any more gets too messy. He draws from 50 different object types.
Your task is to identify which 3 objects are in each composite image.
You must use only the training data provided for this task. Finding your own training data on the internet is not allowed.
Input
Download the file with test and training data. This can be found at the bottom under "attachments". You will receive a zip file containing:
- train.csv: Paths to Harry’s clean drawings and their labels
- test.csv: Paths to Harry’s composite scribbles
- train_images/: Harry’s clean individual drawings
- test_images/: Harry’s composite scribbles (3 objects merged)
- image_merge.py: Example script showing how Harry creates his composite scribbles
Output
For each test image, you should output exactly 1 line with 3 object names separated by spaces. The order of the objects in your output does not matter.
Example
Input: Two composite scribbles:
- First image contains: beach, bear, arm
- Second image contains: coffee_cup, diving_board, guitar
Output:
beach bear arm
coffee_cup diving_board guitar
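The output format described above can be sketched in Python as follows. This is only an illustration: the function name `format_predictions` and the in-memory list of predictions are assumptions made for the example, not part of the provided files.

```python
def format_predictions(predictions):
    """Build the output text: one line per test image, 3 space-separated names.

    `predictions` is a hypothetical list with one entry per test image,
    each entry holding exactly 3 object names (order does not matter).
    """
    lines = []
    for labels in predictions:
        if len(labels) != 3:
            raise ValueError("each test image needs exactly 3 object names")
        lines.append(" ".join(labels))
    # One line per image, each terminated by a newline.
    return "\n".join(lines) + "\n"


# The two composite scribbles from the example in this statement.
example = [
    ("beach", "bear", "arm"),
    ("coffee_cup", "diving_board", "guitar"),
]
print(format_predictions(example), end="")
```

Validating the number of names per line before writing makes a malformed submission fail early instead of being scored as wrong.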
Scoring
Your solution will be evaluated based on how accurately you identify the 3 objects in each merged image.
The scoring is calculated as follows:
- For each test image, you get 1 point for each correctly identified object
- You can get a maximum of 3 points per test image
- Your final score is based on the total number of correctly identified objects divided by the total possible points (3 * number of test images)
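The scoring rules listed above can be sketched in Python for local validation. This is a rough approximation, not the judge's actual code: it assumes that each image's 3 objects are distinct and that a match is counted by set intersection, so order is ignored. The function name `accuracy` is hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of correctly identified objects over all test images.

    `predicted` and `actual` are lists with one entry per image,
    each entry holding 3 object names. Assumption: the 3 names per
    image are distinct, so set intersection counts the matches.
    """
    if len(predicted) != len(actual):
        raise ValueError("need one prediction per test image")
    correct = 0
    for pred, truth in zip(predicted, actual):
        # Order does not matter, so compare as sets (at most 3 points).
        correct += len(set(pred) & set(truth))
    return correct / (3 * len(actual))


truth = [["beach", "bear", "arm"], ["coffee_cup", "diving_board", "guitar"]]
guess = [["arm", "beach", "bear"], ["guitar", "bear", "coffee_cup"]]
print(accuracy(guess, truth))  # 5 of 6 objects are correct
```

Running this on a held-out set of self-generated composites gives an estimate of the accuracy before submitting.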
Your final points are calculated as follows.
Let $S$ be the total number of correctly identified objects across all test images divided by the total possible points (in essence, your accuracy). Then your final points are:
\[ \text{Points} = \max \left(0, \min \left(100, \frac{S}{0.65} \times 100 \right)\right) \]
During the competition, your solution is tested on 30% of the test data. At the end of the competition, all solutions will be retested on the remaining 70% of the data. Your final score will be based only on that remaining 70%; the 30% tested during the competition will have no effect on it.
It is guaranteed that the 30% tested during the competition were chosen uniformly at random and are entirely disjoint from the 70% tested at the end. The results on the 30% should therefore be seen as a strong indicator of how well your solution performs. At the same time, it is detrimental to overfit your solution to the test data.
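The point formula given in this section can be sketched as a small Python function. The name `final_points` is an assumption made for the example; the constants come directly from the formula.

```python
def final_points(s):
    """Map an accuracy `s` in [0, 1] to points, clamped to the range [0, 100]."""
    return max(0.0, min(100.0, s / 0.65 * 100.0))


# Points grow linearly with accuracy and are capped once accuracy reaches 0.65.
for s in (0.0, 0.4, 0.65, 0.9):
    print(s, round(final_points(s), 2))
```

Because of the cap, any accuracy of 0.65 or higher yields the full 100 points, so improvements beyond that threshold do not change the score.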
