
Data mining

3 May 2021, 19:50

Data mining is the automated search for information in very large data sets. Its goal is to identify trends and patterns that conventional analysis would miss. Mathematical algorithms are used to segment the data and to estimate the likelihood of future events.

This article introduces another website scraper: an additional reader that is still evolving. It is intended for sites that the default Octoparse could not handle. It can be refined in two ways: by developers, when a site is complicated enough that writing a scraper is difficult, or by a trained system administrator, who only needs to copy the XPath of the container that holds the content and the XPaths of the containers that should be cut from the result.
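To make the XPath-based configuration concrete, here is a minimal sketch in Python. It is not the scraper described in this article: the function name extract_content, the sample page, and the XPath expressions are all hypothetical, and it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and requires well-formed markup. Real pages would likely need a more lenient HTML parser.

```python
import xml.etree.ElementTree as ET


def _remove_keep_tail(parent, child):
    """Remove child from parent, keeping the text that follows the child."""
    siblings = list(parent)
    index = siblings.index(child)
    tail = child.tail or ""
    if index > 0:
        previous = siblings[index - 1]
        previous.tail = (previous.tail or "") + tail
    else:
        parent.text = (parent.text or "") + tail
    parent.remove(child)


def extract_content(xhtml: str, content_xpath: str, cut_xpaths: list[str]) -> str:
    """Return the text of the content container, minus the unwanted containers.

    Hypothetical helper: content_xpath selects the container with the content,
    cut_xpaths select the containers to cut from the result.
    """
    root = ET.fromstring(xhtml)
    container = root.find(content_xpath)
    if container is None:
        raise ValueError(f"no element matches {content_xpath!r}")

    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in container.iter() for child in parent}

    for xpath in cut_xpaths:
        for unwanted in container.findall(xpath):
            _remove_keep_tail(parents[unwanted], unwanted)

    # Join the remaining text fragments with single spaces.
    pieces = (piece.strip() for piece in container.itertext())
    return " ".join(piece for piece in pieces if piece)


# Sample page invented for illustration.
SAMPLE_PAGE = """<html><body>
<div id="nav">Menu</div>
<div id="content"><h1>Title</h1><div class="ads">Buy now</div><p>Body text.</p><div class="share">Share</div> Tail.</div>
</body></html>"""

text = extract_content(
    SAMPLE_PAGE,
    ".//div[@id='content']",
    [".//div[@class='ads']", ".//div[@class='share']"],
)
print(text)
```

With this split, the administrator only edits the two XPath arguments, while the parsing logic stays in code that developers maintain.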

If you are planning to develop your own web scraping system, we can share one valuable observation: the scraping task is much more difficult than it seems at first glance. The number of problems and hidden pitfalls associated with it can be overwhelming, and the examples given in this article are further confirmation of this. You can find more details on our blog. Because problems need to be recognized early, you should think about a practical logging system from the very beginning. In addition, use a feedback mechanism so that the system administrator quickly receives complaints about low-quality content.
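As a rough illustration of the logging and feedback advice, the sketch below logs every scraped page and records a complaint when the extracted text looks suspicious. Everything in it is an assumption made for the example: the FeedbackQueue class, the check_quality function, and the 200-character threshold are hypothetical and do not come from the system described in this article.

```python
import logging
from dataclasses import dataclass, field

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("scraper")

MIN_CHARS = 200  # hypothetical threshold for "too little content"


@dataclass
class FeedbackQueue:
    """Collects complaints so an administrator can review them quickly."""

    complaints: list = field(default_factory=list)

    def report(self, url: str, reason: str) -> None:
        # Complaints may come from automatic checks or from readers.
        self.complaints.append((url, reason))
        logger.warning("low-quality content url=%s reason=%s", url, reason)


def check_quality(url: str, text: str, feedback: FeedbackQueue,
                  min_chars: int = MIN_CHARS) -> bool:
    """Log the result for one page and file a complaint if it looks bad."""
    stripped = text.strip()
    if not stripped:
        feedback.report(url, "empty result")
        return False
    if len(stripped) < min_chars:
        feedback.report(url, f"only {len(stripped)} characters extracted")
        return False
    logger.info("ok url=%s chars=%d", url, len(stripped))
    return True
```

In a real deployment the queue would probably be backed by e-mail, a ticket system, or a database rather than an in-memory list, so that complaints reach the administrator without anyone having to read the logs.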
