Data harvesting is the practice of collecting large amounts of data from online sources using small computer programs known as bots or scripts. Bots and scripts are lists of programmed instructions that automate actions on a device or a website.
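To make the idea of a script concrete, the following is a minimal sketch in Python of a list of instructions that pulls names out of a web page. The page content, the `name` class, and the `NameCollector` and `harvest_names` names are all hypothetical, chosen only for illustration; a real bot would download pages from a website and repeat the step many times over.

```python
from html.parser import HTMLParser

# Hypothetical page content, held in memory for illustration.
# A real bot would download this from a website instead.
SAMPLE_PAGE = """
<html><body>
  <div class="profile"><span class="name">Alice Example</span></div>
  <div class="profile"><span class="name">Bob Example</span></div>
</body></html>
"""


class NameCollector(HTMLParser):
    """Collects the text of every <span class="name"> element."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs.
        if tag == "span" and ("class", "name") in attrs:
            self._in_name = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching span.
        if self._in_name and data.strip():
            self.names.append(data.strip())


def harvest_names(html):
    """Return every name found in the given HTML text."""
    collector = NameCollector()
    collector.feed(html)
    return collector.names


print(harvest_names(SAMPLE_PAGE))
```

The same few instructions work on any page that uses the same layout, which is why automated collection scales so easily compared with copying information by hand.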

Widespread social media use has led to the collection of copious personal information, which is often the target of data harvesting. Once collected, this data can be used directly or packaged and sold to third parties, e.g. advertisers who use it to target specific audiences.

Consent to collect user information is usually sought by devices or websites, but there have been notable examples of data harvesting without consent. Data privacy initiatives have sought to reduce the amount of personal information collected, in part so that data harvesting practices are less effective.

Data harvesting is not only an issue for individuals. Businesses compile large databases which, if harvested, could be used to undermine them, e.g. by reproducing the databases and publishing them elsewhere online. Data harvesting can also impair website performance, since large volumes of automated requests can slow a site down for its genuine users, which poses a further threat to businesses.

The Cambridge Analytica scandal brought data harvesting to the public’s attention. In 2018, journalists revealed that a British consulting firm had collected data from tens of millions of Facebook profiles without the consent of the users behind them. The firm sold this information to political campaigns, which used it to advertise candidates and causes.

