APHRC Online Microdata Library
Data_Science_and_Evaluation

Analysis of Supermarket Grocery Data for Prediction of Nutritional and Health Outcomes at the Population Level - Supermarket A

Kenya, 2020 - 2023
Data Science and Evaluation (DSE)
Agnes Kiragga
Last modified June 25, 2025

Identification

IDNO
DDI-KEN-APHRC-SUPERMARKET-A-2023-V1.0
Title
Analysis of Supermarket Grocery Data for Prediction of Nutritional and Health Outcomes at the Population Level - Supermarket A
Country
Name Country code
Kenya KEN
Abstract
Rates of overweight, obesity, and chronic diseases such as cardiovascular diseases, hypertension, type 2 diabetes and certain cancers (bowel, lung, prostate and uterine) are on the rise in most sub-Saharan Africa (SSA) countries, including Kenya. These increases can be largely attributed to the shift toward unhealthy diet patterns and increased access to processed foods that are high in fat, sugar, and sodium. The influx of supermarkets in East Africa and the replacement of traditional foods with processed foods place this region in a vulnerable position for greater increases in chronic disease rates. Consumer purchasing history from supermarkets can provide valuable insight into food intake over time and its present and future effects on chronic diseases. Purchasing data from supermarkets is available yet underutilized in SSA.

The study aimed to harmonize and increase accessibility to grocery data, use statistical methods to explore purchasing patterns and predict the effects of nutrition on chronic diseases, and inform policy on the various influences on consumer purchases.

Version

Version Date
2025-06-23
Version Notes
Not Applicable

Scope

Keywords
Keyword Vocabulary URI
Supermarkets MeSH https://www.ncbi.nlm.nih.gov/
Ultra-Processed Foods MeSH https://www.ncbi.nlm.nih.gov/
Processed Foods MeSH https://www.ncbi.nlm.nih.gov/
Non-communicable Chronic Diseases MeSH https://www.ncbi.nlm.nih.gov/

Coverage

Geographic Coverage
National coverage: Nairobi, Nakuru, Kajiado, Machakos and Kirinyaga.
Unit of Analysis
Individuals and supermarket transaction records.
Universe
The survey covers transaction records of individuals who made purchases in supermarkets.

Producers and sponsors

Authoring entity/Primary investigators
Agency Name Affiliation
Agnes Kiragga African Population and Health Research Center (APHRC)
Producers
Name Affiliation Role
Steve Cygu African Population and Health Research Center (APHRC) Co-Investigator - Study coordination and Co-lead
Maureen Ng’etich African Population and Health Research Center (APHRC) Co-Investigator - Study coordination and Co-lead
Lindsey English African Population and Health Research Center (APHRC) Co-Investigator - Supporting methods of data mapping, analysis, and nutrition policy
Reinpeter Momanyi African Population and Health Research Center (APHRC) Co-Investigator - Supporting methods of data mapping and analysis
Elizabeth Kimani African Population and Health Research Center (APHRC) Co-Investigator - Support methods of data analysis and co-lead in policy analysis
Gershim Asiki African Population and Health Research Center (APHRC) Co-Investigator - Support methods of data analysis and co-lead in policy analysis
Funding Agency/Sponsor
Name Abbreviation Role
African Population and Health Research Center APHRC Funder (Big Idea)
Other Identifications/Acknowledgments
Name Affiliation Role
Bonface Ingumba African Population and Health Research Center (APHRC) Data Governance Officer
Shem Mambe African Population and Health Research Center (APHRC) Data Documentation Officer

Sampling

Sampling Procedure
The study is a cross-sectional exploratory study with a phased approach employing quantitative secondary data collection from a third-party information management solution provider. The third-party provider employs an open, integrated point-of-sale and store information retail system that connects retail touch points and sales channels in several counties in Kenya.

Sampling was conducted after a census of all supermarkets subscribed to the third-party system. Only counties with supermarkets subscribed to the platform were sampled. A sample of large, medium-sized and small supermarkets was selected to participate in the study. Supermarket sizes were determined as follows: large supermarkets are those with a cumulative total of more than 8 branches, medium-sized supermarkets are those with 3-8 branches in the counties, and small supermarkets are those with 1-2 branches.


Grocery data was received from 10 supermarket chains.
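
The size thresholds described above can be expressed as a small function. This is only an illustrative sketch in Python: the function name and the example chains are hypothetical and are not part of the study materials.

```python
def classify_supermarket(branch_count: int) -> str:
    """Return the size class implied by the branch thresholds in the sampling procedure."""
    if branch_count < 1:
        raise ValueError("branch_count must be at least 1")
    if branch_count > 8:
        return "large"    # more than 8 branches
    if branch_count >= 3:
        return "medium"   # 3-8 branches
    return "small"        # 1-2 branches


# Hypothetical chains and branch counts, for illustration only.
chains = {"Chain X": 12, "Chain Y": 5, "Chain Z": 2}
sizes = {name: classify_supermarket(count) for name, count in chains.items()}
print(sizes)
```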
Deviations from the Sample Design
Not Applicable
Response Rate
Not Applicable
Weighting
Not Applicable

Data Collection

Dates of Data Collection (YYYY/MM/DD)
Start date End date Cycle
2020-11-07 2023-12-31 Supermarket A
Mode of data collection
Other [oth]
Supervision
Not Applicable
Type of Research Instrument
A standardized form was developed to guide the extraction of supermarket purchase data from the third-party information provider. Variables of interest include supermarket name, supermarket branch, location of supermarket, invoice ID, customer ID, customer demographics (gender, age), date and time of purchase, name of product purchased, unit price per item, number of items purchased, and the payment method used by the customer.

Secondary data collected will not be identifiable as it will be anonymized at the supermarket and client level.

The standardized form is provided as an external resource. The source of each variable is as follows:
V1-V24: the questions are found in the “Study abstraction tool”.
V25-V27: generated classifications (user developed); these are not in any resource.
V28: the questions are found in the “NOVA-Classification-Reference-Sheet”.
V29-V56: the questions are found in the “Kenya Food Composition Tables 2018”.
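
For users who want to look up the documentation source of a variable programmatically, the ranges listed above can be encoded as a lookup table. The sketch below is a convenience illustration; the function name `source_for` and the `V<number>` parsing are assumptions, not part of the dataset.

```python
# Variable ranges and their documentation sources, as listed in this record.
VARIABLE_SOURCES = [
    (1, 24, "Study abstraction tool"),
    (25, 27, "Generated classification (user developed, no external resource)"),
    (28, 28, "NOVA-Classification-Reference-Sheet"),
    (29, 56, "Kenya Food Composition Tables 2018"),
]


def source_for(variable: str) -> str:
    """Return the documentation source for a variable name such as 'V28'."""
    number = int(variable.strip().lstrip("Vv"))
    for first, last, source in VARIABLE_SOURCES:
        if first <= number <= last:
            return source
    raise KeyError(f"{variable} is outside the documented range V1-V56")


print(source_for("V28"))
```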

Data Processing

Cleaning Operations
Not Applicable
Other Processing
The extracted grocery data was in the form of CSV files; it was saved into a local database using PostgreSQL version 15.2 and imported into R version 4.3.3 for cleaning and pre-processing.

Data pre-processing techniques applied included: transactions and demographics alignment, dealing with missing values, checking for data consistency, quality assurance checks and filtering non-food items.
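
A few of these steps (handling missing values, consistency checks, and filtering non-food items) might look roughly like the sketch below. The study used R; Python is used here purely for illustration. The field names, the non-food list, and the choice to drop incomplete rows are all assumptions, since this record does not describe the actual rules applied.

```python
# Hypothetical required fields and non-food items; not taken from the study.
REQUIRED_FIELDS = ("invoice_id", "product_name", "unit_price", "quantity")
NON_FOOD_ITEMS = {"detergent", "toothpaste"}


def preprocess(rows):
    """Drop incomplete, inconsistent, and non-food transaction rows."""
    cleaned = []
    for row in rows:
        # Missing values: assume rows lacking a required field are dropped.
        if any(row.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Consistency: price and quantity must be positive numbers.
        try:
            price = float(row["unit_price"])
            quantity = float(row["quantity"])
        except ValueError:
            continue
        if price <= 0 or quantity <= 0:
            continue
        # Filter non-food items by (hypothetical) product name.
        if row["product_name"].strip().lower() in NON_FOOD_ITEMS:
            continue
        cleaned.append({**row, "unit_price": price, "quantity": quantity})
    return cleaned


# Made-up example rows.
sample_rows = [
    {"invoice_id": "A1", "product_name": "Bread", "unit_price": "60", "quantity": "2"},
    {"invoice_id": "A2", "product_name": "Detergent", "unit_price": "150", "quantity": "1"},
    {"invoice_id": "A3", "product_name": "", "unit_price": "10", "quantity": "1"},
    {"invoice_id": "A4", "product_name": "Milk", "unit_price": "-5", "quantity": "1"},
]
print(preprocess(sample_rows))
```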

After data pre-processing, we applied the NOVA food classification and combined the purchase data with Kenya Food Composition Tables (KFCT). We further developed a classification of nineteen food groups from the food purchases.
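
Conceptually, this step attaches a NOVA group and food composition values to each purchased product. The sketch below shows one way such a lookup could work. The product-to-group mapping is a hypothetical example, and the energy figures are made-up placeholders, not values from the Kenya Food Composition Tables; the actual matching procedure used by the study is not described in this record.

```python
from collections import Counter

# NOVA groups: 1 = unprocessed/minimally processed, 2 = processed culinary
# ingredients, 3 = processed foods, 4 = ultra-processed foods.
# The product assignments below are hypothetical examples.
NOVA_GROUP = {"maize flour": 1, "sugar": 2, "bread": 3, "soda": 4}

# Placeholder energy values (kcal per 100 g), NOT taken from the KFCT.
PLACEHOLDER_ENERGY = {"maize flour": 100.0, "sugar": 200.0, "bread": 300.0, "soda": 400.0}


def annotate(purchases):
    """Attach a NOVA group and a composition value to each purchase record."""
    annotated = []
    for purchase in purchases:
        name = purchase["product_name"].strip().lower()
        annotated.append({
            **purchase,
            "nova_group": NOVA_GROUP.get(name),              # None if unmatched
            "energy_kcal_per_100g": PLACEHOLDER_ENERGY.get(name),
        })
    return annotated


def nova_share(annotated):
    """Return the share of matched purchase records falling in each NOVA group."""
    counts = Counter(p["nova_group"] for p in annotated if p["nova_group"] is not None)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()} if total else {}


example = annotate([
    {"product_name": "Soda"},
    {"product_name": "Bread"},
    {"product_name": "soda"},
    {"product_name": "Unknown item"},
])
print(nova_share(example))
```

Unmatched products are kept with a `nova_group` of `None` rather than dropped, so that coverage of the lookup can be inspected separately.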

Data Appraisal

Estimates of Sampling Error
Not Applicable

Data access

Contact
Name Email URI
African Population and Health Research Center datarequests@aphrc.org/info@aphrc.org aphrc.org
Conditions
APHRC data access condition

All non-APHRC staff seeking to use data generated at the Center must obtain written approval to use the data from the Director of Research.
This form is developed to assess applications for data use and facilitate responsible sharing of data with external partners/collaborators/researchers. By entering into this agreement, the undersigned agrees to use these data only for the purpose for which they were obtained and to abide by the conditions outlined below:

1. Data Ownership:
The data remain the property of APHRC; any unauthorized reproduction and sharing of the data is strictly prohibited. The user will, therefore, not release nor permit others to use or release the data to any other person without the written authorization from the Center.

2. Purpose:
The provided data must be used for the purpose specified in the Data Request Form; any other use not specified in the form must receive additional or separate authorization.

3. Respondent Identifiers:
The Center is committed to protecting the identity of the respondents who provide information in its research. All analytical data sets (both qualitative and quantitative) released by the Data Unit MUST be stripped of respondent identifiers to protect the identity of the respondents. By accepting to use APHRC data, the user pledges that he/she will not, under any circumstance, regenerate the identifiers or permit others to use the data to learn the identity of any individual, household or community included in any data set.

4. Confidentiality pledge:
The user will not use nor permit others to use the data to report any information in the data sets that could identify, directly or by inference, individuals or households.

5. Reporting of errors or inconsistencies:
The user will notify the Head of the Statistics and Survey Unit of any errors in the data as soon as they are discovered.

6. Publications resulting from APHRC data:
The Center requires external collaborators to work with APHRC staff on all publications resulting from its data. To facilitate this, lead authors should send a detailed concept note of the paper (including the background, rationale, data, analytical methods, and preliminary findings) to the Principal Investigator (or Theme Leader) for the project (with a copy to the Director of Research), who will circulate the abstract to concerned researchers for possible expression of interest in participating in the publication as co-authors. Any exception to the involvement of APHRC staff should be approved by the Director of Research, APHRC.

7. Security:
The user will take responsibility for the security of the data by ensuring that the data are used and stored in a secure environment where access is password protected. This will ensure that unauthorized people do not have access to the data.

8. Loss of privilege to use data:
In the event that APHRC determines that the data user is in violation of the conditions for using the data, or if the user wishes to cancel this agreement, the user will destroy the data files provided to him/her. APHRC retains the right to revoke this agreement or to inform publishers to withhold publication of any work based wholly or in part on its data if the conditions for using the data are violated.

9. Acknowledgement:
Any work/reports from these data must acknowledge APHRC as the source of the data. For example, the suggested acknowledgement for NUHDSS data is:
"This research uses livelihoods data collected under the longitudinal Nairobi Urban Health and Demographic Surveillance System (NUHDSS) since 2006. The NUHDSS is carried out by the African Population and Health Research Center in two slum settlements (Korogocho and Viwandani) in Nairobi City."
Additionally, all funders, the study communities that provided the data, and staff who collected and analyzed or processed the data should be acknowledged.

10. Deposit of Reports/Papers:
The user should submit electronic and paper copies of all publications generated using APHRC data to the Policy Engagement and Communications Department, with copies to the Director of Research.

11. Change of contact details:
The user will promptly inform the Director of Research of any change in his/her personal details as contained on the data request form.
Citation requirement
Use of the dataset must be acknowledged using a citation that includes:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
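
The required elements can be assembled into a citation string as sketched below. The ordering and punctuation are an assumption, since this record does not prescribe a citation format; the investigator, title, and reference number are taken from this record, while the download date is a made-up example.

```python
from datetime import date


def build_citation(investigator: str, title: str, reference: str,
                   source: str, downloaded: date) -> str:
    """Combine the required citation elements into one string (format is illustrative)."""
    return (
        f"{investigator}. {title}. Ref. {reference}. "
        f"Dataset downloaded from {source} on {downloaded.isoformat()}."
    )


citation = build_citation(
    investigator="Agnes Kiragga",
    title=("Analysis of Supermarket Grocery Data for Prediction of Nutritional "
           "and Health Outcomes at the Population Level - Supermarket A, Kenya, 2020-2023"),
    reference="DDI-KEN-APHRC-SUPERMARKET-A-2023-V1.0",
    source="APHRC Microdata Portal",
    downloaded=date(2025, 7, 1),  # hypothetical download date
)
print(citation)
```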

Disclaimer and copyrights

Disclaimer
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Copyright
Copyright © APHRC, 2025

Metadata production

Document ID
DDI-KEN-APHRC-SUPERMARKET-A-2023-V1.0
Producers
Name Abbreviation Role
African Population and Health Research Center APHRC Documentation of the DDI
Date of Production
2025-06-23
Document version
Version 1.1 (June 2025)