Manual retail price observations discontinued
Every month, CBS collects information on the prices of a large number of products and services. These data are used to calculate the consumer price index (CPI), an important indicator of consumer price inflation. For many years, the source data were collected from shops around the country by CBS interviewers. They would enter the shops with a ‘shopping list’ of representative products and record the prices on the spot. This was a labour-intensive process, and it has been gradually scaled back over the past few years. Jos van Linden is familiar with the ins and outs of this process. She has worked as an interviewer for CBS for the past 20 years and visited a number of stores each month. ‘I’ve always enjoyed this work. In some of the stores, I was even a regular visitor for ten years running. It allowed me to build a close relationship with the shopkeepers.’
Pleasant job
What did the interviewer’s job entail? ‘I’d report to the shopkeeper and then make my round through the shop. Usually, I would record about 20 to 25 different products. We always tried to make sure our work gave the smallest possible disruption to shopkeepers. For example, the customer would always go first and we’d pick the most convenient times to visit shops, for instance on a Wednesday or Thursday morning.’ Van Linden did notice how her work changed over the past few years. ‘We knew that more and more shops no longer required visiting because CBS started receiving the scanner data directly from those shops. For the shopkeepers, this was a welcome development. As for me, I find it a pity. To us interviewers, it was always a pleasant job and the response was guaranteed.’
Major step
On 13 December 2019, CBS interviewers made their rounds recording retail prices for the Dutch CPI for the very last time. Discontinuing store price observations is a major step for CBS. Collecting prices via scanner data and web scrapers instead raises the quality of the consumer price index even further and saves costs. In addition, shopkeepers are no longer inconvenienced. Van Linden: ‘We will, however, continue the regular visits in April and November to shops in a number of large cities, including clothes shops, furniture shops and supermarkets as well as restaurants and cafés, in order to compile price level indices at the European level (an obligation imposed by the EU’s statistical authority, ed.). But there too, the number of price observations is decreasing. Fortunately, CBS still has plenty of other surveys that need to be carried out and that require face-to-face interviews, for example surveys on health perceptions, changes taking place in society and lifestyle.’
Scanner data and web scrapers
Koen Link, statistical researcher at CBS, explains how CBS is now able to collect all the necessary information on retail prices efficiently using scanner data and web scraping: ‘In 2003, CBS began using supermarket scanner data for the first time. These are cash register data including the number of items sold and the turnover per product. This is how we determine the price per barcode. The use of these scanner data has taken off in recent years. In addition, CBS uses so-called web scrapers and robot tools. We let computer programs retrieve the information we need from the websites of clothes shops. As of 1 January 2020, the same is done for shoe shops and furniture shops. As for barbershops, we use a robot tool that sends a signal as soon as prices on the website change. Smaller shops nowadays have their own websites as well. If one or two shops are still missing from our data and we really need them, we send those shops a questionnaire.’
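To illustrate how a price per barcode can follow from items sold and turnover, here is a minimal Python sketch. It is not CBS's actual procedure: the function name `unit_values`, the record layout and the barcodes are assumptions made for the example, and the price is simply taken as total turnover divided by total items sold.

```python
from collections import defaultdict


def unit_values(records):
    """Return an average price per barcode.

    `records` is an iterable of (barcode, items_sold, turnover) tuples.
    This layout is hypothetical; real scanner data files will differ.
    """
    items = defaultdict(int)
    turnover = defaultdict(float)
    for barcode, items_sold, amount in records:
        items[barcode] += items_sold
        turnover[barcode] += amount
    # Skip barcodes with no sales: a price cannot be derived for them.
    return {b: turnover[b] / items[b] for b in items if items[b] > 0}


if __name__ == "__main__":
    # Made-up cash register records for two barcodes.
    sample = [
        ("8710000000017", 10, 12.50),
        ("8710000000017", 30, 36.00),
        ("8710000000024", 4, 9.96),
    ]
    for barcode, price in unit_values(sample).items():
        print(barcode, round(price, 4))
```

Summing before dividing weights each sale by the number of items sold, so a week with many sales counts for more than a week with few.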
New statistical methods
For proper processing of the scanner and online data, CBS has developed special statistical methods and new practices. The use of web scrapers provides datasets of an enormous size, for instance. Proper classification of those data is important. Link explains: ‘In clothing, we need to make a distinction between dresses, skirts, short sleeves, long sleeves, etc. CBS uses name filters and checks the names of product groups instead of individual products. This enables us to categorise the products in a better way. The advantage of this new practice is that we arrive at a much more accurate figure, as we now incorporate all the data rather than a mere selection of items that we placed in the interviewers’ shopping basket by way of sampling.’
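The idea of a name filter applied to product group names can be sketched as follows. This is an illustration only: the categories come from Link's example, but the patterns, the `NAME_FILTERS` list and the `classify` function are assumptions, not CBS's actual filters.

```python
import re

# Hypothetical name filters: each category is paired with a pattern that is
# matched against the product group name found on a shop's website.
NAME_FILTERS = [
    ("dresses", re.compile(r"\bdress(es)?\b", re.IGNORECASE)),
    ("skirts", re.compile(r"\bskirts?\b", re.IGNORECASE)),
    ("short sleeves", re.compile(r"\bshort[- ]sleeves?\b", re.IGNORECASE)),
    ("long sleeves", re.compile(r"\blong[- ]sleeves?\b", re.IGNORECASE)),
]


def classify(group_name):
    """Return the first category whose filter matches, or None."""
    for category, pattern in NAME_FILTERS:
        if pattern.search(group_name):
            return category
    return None


if __name__ == "__main__":
    # Made-up product group names as they might appear on a website.
    for name in ["Women > Dresses", "Summer skirt", "Shirts long-sleeve", "Socks"]:
        print(name, "->", classify(name))
```

Because the filter works on group names, every product scraped under a matching group is classified in one step, and unmatched groups (returned as `None`) can be set aside for manual review.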
Ongoing development and innovation
By 2019, observations were already limited to 420 shops per month. These were mainly shoe shops, furniture shops and smaller specialist shops. That is all over now as well. Link says: ‘As we are expanding the use of scanner data and web scrapers, this type of store observation is not necessary anymore. We are the first statistical office within the euro area to discontinue store price observations altogether. That’s a milestone. Eventually, we aim to implement machine learning techniques to classify products into specific categories. We are working on that now. It’s a process of ongoing development and innovation.’