Ostrava, CZ
6 days ago
Overview
IDC is seeking a Manager, Webscraping and Data Harvesting Team, to lead our recently established team that crawls the web and gathers data from the internet.

In this global role, the primary focus will be deploying web crawling technology to collect structured and unstructured data from hundreds of different sites on a specific schedule, then cleaning, classifying, validating, and unifying that data according to business rules and a given taxonomy.

The role also includes enriching the data with various other information and integrating it into existing products and internal business processes.

Responsibilities
- Manage an agile team responsible for web crawling and data gathering for our largest data product line
- Evaluate, create, and deploy web crawling technology
- Develop machine learning algorithms with a Natural Language Processing focus to clean, classify, and match gathered data to the existing taxonomy
- Work with internal business stakeholders to integrate scraped data into the existing research process and proprietary systems
- Work cross-departmentally to define metrics, guidelines, and strategies to measure data coverage and quality
- Be part of a global team that designs and builds new products to aggregate and visualize scraped data from various data sources

Qualifications
- Bachelor's degree or equivalent in Mathematics, Computer Science, Statistics, or Information Management
- 4+ years of experience in web scraping, data management, analysis, or functions related to data analysis problems
- Experience using technologies including Python (Scrapy), Octoparse, Mozenda, NLTK, and PostgreSQL
- Demonstrated strong team management and collaboration skills
- Experience with project management practices and the ability to manage multiple projects simultaneously

What we are offering

- International environment with daily use of English
- Time for self-development and other training
- Flexible working hours
- 5 weeks of vacation
- 3 sick days
- Attractive compensation package
- Annual bonus
- Cafeteria system
- And a lot more

This position may be based in Ostrava, Czech Republic.

