Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Leeds-CDRC
  • License: gpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 12.2 MB
Statistics
  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created almost 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Eatwell Classification Tool


Overview:

This tool classifies food items into the food group segments of the UK’s Eatwell Guide. It is designed to aid automated food group classification for big data sources, such as grocery retailer transaction records.

Version 1.0

This version of the Eatwell classification tool takes product information (e.g. product name, description, shelving categories) and uses text matching algorithms to assign each food product to a segment of the Eatwell Guide. To reflect real-world baskets, in addition to the five standard segments defined in the Eatwell Guide, products can also be classified as an alcoholic beverage, non-alcoholic beverage, discretionary food, composite food, baby/toddler food, other (e.g. spices and flavourings) or non-food item (i.e. items that may be purchased alongside food, such as kitchen foil or toothpaste). The full category descriptions, the logic behind their inclusion, and examples are given in Table 1.

|Category |Detail |Example(s)|
|---------|-------|----------|
|Fruit and Vegetables |Eatwell food category, recommended to be 39% of food consumed (by weight) |Carrots, Apple, Kiwi, Salad |
|Potatoes, bread, rice, pasta and other starchy carbohydrates |Eatwell food category, recommended to be 37% of food consumed (by weight) |Wholegrains, Porridge, Cous cous, Cereals |
|Beans, pulses, fish, eggs, meat and other proteins |Eatwell food category, recommended to be 12% of food consumed (by weight) |Lentils, Chickpeas, Meat, Fish, Eggs |
|Dairy and alternatives |Eatwell food category, recommended to be 8% of food consumed (by weight) |Milk, Cheese, Soya milk |
|Oils and spreads |Eatwell food category, recommended to be 1% of food consumed (by weight) |Olive oil, Sunflower spread |
|Discretionary Foods |Foods that should be eaten less often and in small amounts (remaining 3% of food consumed by weight) |Cakes, Crisps, Biscuits, Chips |
|Alcoholic Beverages |Alcoholic drinks (not included in Eatwell guidance) |Wines, Beers, Spirits |
|Non-alcoholic Beverages |Non-alcoholic drinks (not included in Eatwell guidance); user discretion to include as discretionary where appropriate |Squash, Cordial, Juice, Fizzy drinks |
|Composite foods |Foods that are made up of foods in more than one category[^1] |Ready meals, Lasagne, Quiche |
|Toddler and baby food |Toddlers and babies have different dietary recommendations to the Eatwell Guide, so these products are separated out for ease |Formula, baby purees |
|Other foods |Food items without a significant nutritional contribution, i.e. flavourings, herbs, spices |Dried herbs and spices, pepper, salt |
|Non-food items |Products potentially erroneously included as they are typically purchased alongside a food shop |Kitchen foil, Toothpaste, Homeware |

Table 1: Overview of the food categories used in the Eatwell Classification Tool

[^1]: The user can decide how to handle these composite foods depending on the research question being asked. Later versions will assist in calculating fruit and vegetable portions in these food groups.
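The recommended shares in Table 1 can be compared against the composition of a classified basket. The following is a minimal sketch, not part of the tool itself: the function names and the `(category, weight)` input format are assumptions made for illustration, and excluding drinks, composite foods, baby food, other foods and non-food items from the total is one possible choice rather than a rule stated above.

```python
# Recommended shares (percent of food consumed, by weight) as listed in Table 1.
RECOMMENDED_SHARE = {
    "Fruit and Vegetables": 39,
    "Potatoes, bread, rice, pasta and other starchy carbohydrates": 37,
    "Beans, pulses, fish, eggs, meat and other proteins": 12,
    "Dairy and alternatives": 8,
    "Oils and spreads": 1,
    "Discretionary Foods": 3,
}


def basket_shares(items):
    """Return the percentage of basket weight in each Eatwell segment.

    `items` is an iterable of (category, weight) pairs; this input format
    is hypothetical. Categories without a recommended share are ignored
    (an assumption of this sketch).
    """
    totals = {category: 0.0 for category in RECOMMENDED_SHARE}
    for category, weight in items:
        if category in totals:
            totals[category] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {category: 0.0 for category in totals}
    return {c: 100 * w / grand_total for c, w in totals.items()}


def share_gaps(items):
    """Return observed share minus recommended share, in percentage points."""
    shares = basket_shares(items)
    return {c: shares[c] - RECOMMENDED_SHARE[c] for c in RECOMMENDED_SHARE}
```

A positive gap means the basket contains proportionally more of that segment than the Eatwell Guide recommends; a negative gap means less.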

How the Algorithm works

The text mining algorithm uses an iteratively developed lexicon to assign each product to one of the extended Eatwell categories outlined in Table 1. The algorithm first matches the product to N candidate categories, then applies rules based on expert domain knowledge to assign the final category. Matching justifications are provided for transparency and can be modified by the user.

  • E.g. “Eton Mess: Strawberries and Meringue” would match two categories: Fruit and Vegetables, and Discretionary. Because one of the rules is that any product with a Discretionary element is classified as such, the final Eatwell category assigned would be Discretionary.

  • E.g. “Garden Salad: Lettuce, Tomato, Cucumber” would match four times to the Fruit and Vegetables Eatwell category, so it would be assigned to that category with an indication of a high probability of correct classification.
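The two-stage approach described above (lexicon matching, then rule-based resolution) can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the toy `LEXICON`, the function names, and the use of the most-matched category as a tie-breaker are all hypothetical. Only the rule that a Discretionary match takes priority comes from the description above.

```python
import re
from collections import Counter

# Toy lexicon with hypothetical keywords; the tool's real lexicon is far larger.
LEXICON = {
    "Fruit and Vegetables": {"strawberries", "lettuce", "tomato", "cucumber", "salad"},
    "Discretionary Foods": {"meringue", "cake", "crisps", "biscuits"},
    "Dairy and alternatives": {"milk", "cheese"},
}


def match_categories(text):
    """Count lexicon keyword hits per category in the product text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for category, keywords in LEXICON.items():
        hits = sum(1 for token in tokens if token in keywords)
        if hits:
            counts[category] = hits
    return counts


def classify(text):
    """Return (final_category, match_counts) for a product description."""
    counts = match_categories(text)
    if not counts:
        return None, counts  # no lexicon match at all
    # Rule from the description: any Discretionary element wins.
    if "Discretionary Foods" in counts:
        return "Discretionary Foods", counts
    # Assumed fallback: choose the category with the most keyword hits.
    category, _ = counts.most_common(1)[0]
    return category, counts
```

With this toy lexicon, `classify("Eton Mess: Strawberries and Meringue")` returns Discretionary Foods, and `classify("Garden Salad: Lettuce, Tomato, Cucumber")` returns Fruit and Vegetables with four matches. The returned counts play the role of the matching justification.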

Algorithm Development

Using real-world product data, the algorithm has been designed iteratively to capture a wide range of products. To respect commercial sensitivity, brand names are not used to inform classification; however, users have the option to assign brand items to an Eatwell category to improve business-specific classification. The algorithm and underlying database will continue to be updated to further improve product classification.
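One way a user-supplied brand assignment could be layered on top of a classifier is shown below. This is a sketch under assumptions: the function name, the `brand_overrides` mapping, the case-insensitive substring match, and the `fallback` callable are all hypothetical and do not describe the tool's own interface.

```python
def classify_with_brands(product_name, brand_overrides, fallback):
    """Classify a product, checking user-assigned brands first.

    brand_overrides: dict mapping a brand name to an Eatwell category
        (supplied by the user; hypothetical format).
    fallback: callable taking the product name and returning a category,
        used when no brand matches (for example, a lexicon-based classifier).
    """
    name = product_name.lower()
    for brand, category in brand_overrides.items():
        # Assumed matching rule: case-insensitive substring match.
        if brand.lower() in name:
            return category
    return fallback(product_name)
```

For example, with `brand_overrides = {"Acme Crunch": "Discretionary Foods"}` (an invented brand), a product named "ACME Crunch 6 pack" would be assigned to Discretionary Foods without consulting the fallback.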

Caveats

  • Assumptions on the data may need to be modified depending on end use.
  • It is recommended that all classifications are validated against nutritional information. We have produced interactive visualisations (see notebook___) to assist in visual validation of the data.
  • Check back regularly for code updates.

Upcoming version (2.0)

  • An interactive web dashboard is planned for version 2.0 to allow the use of the Eatwell classification tool without programming experience.

Owner

  • Name: Leeds-CDRC
  • Login: Leeds-CDRC
  • Kind: organization

Citation (citation.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Eatwell Classification Tool
message: >-
  Data provided by the Consumer Data Research Centre (CDRC)
  an ESRC Data Investment refs ES/L011840/1; ES/L011891/1
type: software
authors:
  - given-names: Francesca
    family-names: Pontin
    email: f.l.pontin@leeds.ac.uk
    affiliation: University of Leeds
    orcid: 'https://orcid.org/0000-0002-7143-8718'
  - name: Consumer Data Research Centre
    city: Leeds
identifiers:
  - type: doi
    value: 10.5281/zenodo.7074554
repository-code: >-
  https://github.com/Leeds-CDRC/Eatwell_product_classification/tree/main
abstract: >-
  This version of the Eatwell classification tool takes
  product information (e.g. product name, description,
  shelving categories) and uses text matching algorithms
  to assign each food product to a segment of the Eatwell
  Guide. To reflect real-world baskets, in addition to the
  five standard segments defined in the Eatwell Guide,
  products can also be classified as an alcoholic beverage,
  non-alcoholic beverage, discretionary food, composite
  food, baby/toddler food, other (e.g. spices and
  flavourings) or non-food item (i.e. items that may be
  purchased alongside food, such as kitchen foil or
  toothpaste).
keywords:
  - Eatwell Guide
  - Diet & Nutrition
  - Supermarket data
license: GPL-3.0
version: '1.0'
date-released: '2022-09-13'
preferred-citation:
  type: software
  title: "Eatwell Classification Tool"
  authors:
  - given-names: "Francesca"
    family-names: "Pontin"
    orcid: "https://orcid.org/0000-0002-7143-8718"
  doi: "10.5281/zenodo.7074554"
  month: 9
  year: 2022
  version: 1
  collection-title: "Eatwell Classification Tool"
  url: "https://github.com/Leeds-CDRC/Eatwell_product_classification/tree/main"

GitHub Events

Total
  • Push event: 15
Last Year
  • Push event: 15