Challenges

Published: 16.09.2016

Are you looking for an internship or a thesis topic? Great!

To apply, send us your resume (or a link to your GitHub account) and your solution to at least one of the following puzzles.

1. The Tile Challenge

Data can be messy! The ability to organize pieces of information to get useful insights is essential to a data scientist.

In this challenge, your task is to reconstruct the correct image from the messy data provided in the link below. The solution reveals the habitat of the best data scientists in Estonia 😉

Download the image in the link below, use your favorite programming language, and good luck!

Image link: puzzle.png

Send us your final image and the code you wrote to solve the puzzle (image in PNG format, code as plain text).
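If you are unsure where to start, here is a minimal sketch of the basic building blocks, not a solution. It assumes the puzzle is a grid of equally sized, shuffled rectangular tiles, which is only an assumption about the data. The image is represented as a plain list of rows of grayscale values, and the names `split_into_tiles`, `assemble` and `edge_cost` are made up for this illustration.

```python
def split_into_tiles(pixels, tile_h, tile_w):
    """Cut a 2D grid (list of rows) into tiles, returned in row-major order."""
    rows, cols = len(pixels), len(pixels[0])
    if rows % tile_h or cols % tile_w:
        raise ValueError("image size must be a multiple of the tile size")
    tiles = []
    for top in range(0, rows, tile_h):
        for left in range(0, cols, tile_w):
            tiles.append([row[left:left + tile_w]
                          for row in pixels[top:top + tile_h]])
    return tiles


def assemble(tiles, order, tiles_per_row):
    """Rebuild a 2D grid by placing tiles[order[i]] at grid slot i (row-major)."""
    tile_h = len(tiles[0])
    out = []
    for start in range(0, len(order), tiles_per_row):
        band = [tiles[i] for i in order[start:start + tiles_per_row]]
        for y in range(tile_h):
            # Concatenate row y of every tile in this horizontal band.
            out.append([px for tile in band for px in tile[y]])
    return out


def edge_cost(left_tile, right_tile):
    """Sum of absolute differences between the right column of left_tile
    and the left column of right_tile (lower means a smoother seam)."""
    return sum(abs(a[-1] - b[0]) for a, b in zip(left_tile, right_tile))
```

To work on the real PNG you would first need to load it into such a grid with an image library of your choice. A score like `edge_cost` is one possible way to decide which tiles belong next to each other. Whether it helps depends on how the image was actually scrambled.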

2. Text analysis with the Estnltk toolkit

Your task is to download the article from http://www.sirp.ee/s1-artiklid/c21-teadus/kvantilm/ and answer the following questions:

  • What is the number of unique words and lemmas in the text?
  • What are the most frequently mentioned person names?
  • What is the distribution of parts of speech in the text?

To answer these questions, you will need to write a Python script that loads the HTML page, extracts the article body, and performs the necessary text analysis.
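The overall shape of such a script might look like the sketch below, which uses only the Python standard library. It assumes that the article body sits in `<p>` elements, which may not match the real page, and it deliberately leaves out the Estnltk calls: lemmas, person names and part-of-speech tags have to come from the toolkit. The `normalize` parameter and the `labels` argument are hypothetical hooks where that output would be plugged in.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen

# Runs of letters only (no digits or underscores); works for Estonian letters too.
WORD_RE = re.compile(r"[^\W\d_]+")


class ParagraphExtractor(HTMLParser):
    """Collects the text of every <p> element (assumed to hold the article body)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while we are outside a <p> element

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def fetch_html(url):
    """Download a page and decode it using the charset announced by the server."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_paragraphs(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    return parser.paragraphs


def count_words(paragraphs, normalize=str.lower):
    """Count word forms; pass a lemmatizer as `normalize` to count lemmas instead."""
    words = WORD_RE.findall(" ".join(paragraphs))
    return Counter(normalize(word) for word in words)


def distribution(labels):
    """Relative frequency of each label, e.g. of part-of-speech tags."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

With this in place, `count_words(extract_paragraphs(fetch_html(url)))` gives word-form counts, and `len()` of the result is the number of unique words. `Counter.most_common()` can rank names once a named-entity tagger has produced them.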

Resources:
