Challenges

Published: 16.09.2016

Are you looking for an internship or a thesis topic? Great!

To apply, send your resume (or a link to your GitHub account) together with your solution to at least one of the following puzzles.

1. The Tile Challenge

Data can be messy! The ability to organize pieces of information to get useful insights is essential to a data scientist.

In this challenge, your task is to reconstruct the correct image from the messy data provided in the link below. The solution reveals the habitat of the best data scientists in Estonia 😉

Download the image in the link below, use your favorite programming language, and good luck!

Image link: puzzle.png

Send us your final image and the code you wrote to produce it (image in PNG format, code as plain text).
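As a starting point, the tile-handling part of such a puzzle can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the image is represented as a list of rows of grayscale values, the tiles are assumed to be equal-sized rectangles, and the function names (`split_into_tiles`, `assemble`, `seam_cost`) are hypothetical. The actual layout of puzzle.png may differ, and reading or writing the PNG itself would need an imaging library of your choice.

```python
# Hypothetical helpers; tile size, tile order and the pixel format are assumptions.

def split_into_tiles(pixels, tile_h, tile_w):
    """Cut a 2D pixel grid into equal tiles, listed row by row."""
    tiles = []
    for top in range(0, len(pixels), tile_h):
        for left in range(0, len(pixels[0]), tile_w):
            tiles.append([row[left:left + tile_w]
                          for row in pixels[top:top + tile_h]])
    return tiles


def assemble(tiles, order, tiles_per_row):
    """Rebuild a pixel grid by placing tiles[order[i]] into grid slot i."""
    rows = []
    for start in range(0, len(order), tiles_per_row):
        band = [tiles[i] for i in order[start:start + tiles_per_row]]
        for y in range(len(band[0])):
            # Concatenate row y of every tile in this horizontal band.
            rows.append([px for tile in band for px in tile[y]])
    return rows


def seam_cost(left_tile, right_tile):
    """Sum of absolute pixel differences along the shared vertical seam
    if right_tile were placed directly to the right of left_tile."""
    return sum(abs(l_row[-1] - r_row[0])
               for l_row, r_row in zip(left_tile, right_tile))
```

A low `seam_cost` between two tiles suggests they may be neighbours, which is one possible heuristic for finding the correct order; whether it works on the real puzzle depends on how the image was scrambled.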

2. Text analysis with the Estnltk toolkit

Your task is to download the article from http://www.sirp.ee/s1-artiklid/c21-teadus/kvantilm/ and answer the following questions:

  • What is the number of unique words and lemmas in the text?
  • What are the most frequently mentioned person names?
  • What is the distribution of parts of speech in the text?

To answer these questions, you will need to write a Python script that loads an HTML page, extracts the article body, and performs the necessary text analysis.
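The extraction and counting steps could look roughly like the sketch below, which uses only the Python standard library. It assumes that the article body lives in `<p>` elements, which may not hold for the real page, and the names `ParagraphExtractor`, `extract_text` and `word_counts` are hypothetical. It counts surface word forms only; lemmas, person names and parts of speech require morphological analysis with Estnltk and are not shown here.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (an assumption about
    where the article body lives)."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False
            self.chunks.append(" ")  # keep paragraphs from running together

    def handle_data(self, data):
        if self.in_paragraph:
            self.chunks.append(data)


def extract_text(html):
    """Return the paragraph text of an HTML string with whitespace normalized."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def word_counts(text):
    """Count lower-cased word forms; a word is a run of Unicode letters."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))
```

With the page source loaded into a string (for example via `urllib.request`), `len(word_counts(extract_text(html)))` gives the number of unique word forms, and `word_counts(...).most_common(10)` lists the most frequent ones.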

Resources:
