thefork-scraping-code
the fork assignment for your data analysis courses
Repository
Basic Info
- Host: GitHub
- Owner: fbietti
- Language: R
- Default Branch: main
- Size: 71.3 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
the fork scraping code
This project arose from the need to find new topics for my data analysis assignments at the university. The idea was to identify data online, structure it into a dataset, then conduct analyses and engage the students in practical exercises!
In this project, you will find a file for scraping data from The Fork website. It is a Python script and is well commented. It does not require prior knowledge of HTML or CSS: you only need to identify the element you are interested in on the site and copy the corresponding code into the for loop.
File: scraping_code
This file contains the code for performing web scraping. First, go to The Fork website and run your search. In my case, I was interested in Parisian pizzerias, so I searched for the keyword 'pizzeria'. The site returned 168 pages of results. Each page contains multiple results, and each result corresponds to one pizzeria. Each result includes various elements, such as the address, price, pizzeria name, etc.
You should inspect the element you are interested in. In my case, I focused on the pizzeria name, rating, price, review, and address. I initialized lists for each of these elements and also set up page numbers so that the code could navigate from one page to another.
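As a rough illustration of the loop described above, the sketch below uses only Python's standard library, which may differ from the package the repository's script relies on. The markup, the class names (`name`, `rating`, `price`, `reviews`, `address`), and the `fetch_page` helper are all hypothetical placeholders; the real structure has to be read from the browser inspector.

```python
from html.parser import HTMLParser

# Hypothetical markup: these tag and class names are placeholders,
# not The Fork's real ones. Replace them with what the inspector shows.
SAMPLE_PAGE = """
<div class="result">
  <h2 class="name">Pizzeria A</h2>
  <span class="rating">9.1</span>
  <span class="price">18</span>
  <span class="reviews">245</span>
  <p class="address">10 Rue Exemple, 75011 Paris</p>
</div>
<div class="result">
  <h2 class="name">Pizzeria B</h2>
  <span class="rating">8.4</span>
  <span class="price">25</span>
  <span class="reviews">87</span>
  <p class="address">3 Avenue Exemple, 75004 Paris</p>
</div>
"""

FIELDS = ("name", "rating", "price", "reviews", "address")


class ResultParser(HTMLParser):
    """Collect the text of every element whose class is one of FIELDS."""

    def __init__(self):
        super().__init__()
        self.found = {field: [] for field in FIELDS}
        self._current = None  # field whose text is being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in FIELDS:
            self._current = css_class

    def handle_data(self, data):
        if self._current and data.strip():
            self.found[self._current].append(data.strip())
            self._current = None

    def handle_endtag(self, tag):
        # Stop reading so an empty element cannot capture later text.
        self._current = None


def fetch_page(page_number):
    """Stand-in for the real download step; always returns SAMPLE_PAGE."""
    return SAMPLE_PAGE


# One list per element of interest, filled page by page.
names, ratings, prices, reviews, addresses = [], [], [], [], []

for page_number in range(1, 3):  # the search described above had 168 pages
    parser = ResultParser()
    parser.feed(fetch_page(page_number))
    names.extend(parser.found["name"])
    ratings.extend(parser.found["rating"])
    prices.extend(parser.found["price"])
    reviews.extend(parser.found["reviews"])
    addresses.extend(parser.found["address"])
```

If a result lacks one of the elements, the lists end up with different lengths and the rows no longer line up, so a real run should check for that before building the dataset.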
Next, using the lists, I created a dictionary, and I transformed the dictionary into a data frame. Finally, with the data frame, I exported a CSV file.
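The lists-to-CSV step might look like the following minimal sketch. It assumes pandas is the data frame library, which the README does not state; the values, the column names, and the `pizzerias.csv` file name are made up for illustration.

```python
import pandas as pd

# Made-up values standing in for the lists filled by the scraping loop.
names = ["Pizzeria A", "Pizzeria B"]
ratings = [9.1, 8.4]
prices = [18, 25]
reviews = [245, 87]
addresses = ["10 Rue Exemple, 75011 Paris", "3 Avenue Exemple, 75004 Paris"]

# Each key becomes a column; all lists must have the same length.
data = {
    "name": names,
    "rating": ratings,
    "price": prices,
    "reviews": reviews,
    "address": addresses,
}
df = pd.DataFrame(data)

# index=False leaves the row index out of the exported file.
df.to_csv("pizzerias.csv", index=False)
```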
File: celaningdataset
This file contains R commands to clean and structure the dataset. The main step extracts the district (arrondissement) number from the address, which makes it possible to compare average pizzeria prices across neighborhoods of Paris.
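The repository does this cleaning in R; for consistency with the scraping script, here is the same idea sketched in Python. It assumes each address contains a Paris postcode of the form 750NN (or 75116, also used in the 16th), and the addresses and prices are invented examples.

```python
import re
from collections import defaultdict
from statistics import mean

# Paris postcodes look like 750NN, where NN is the district (01 to 20).
# 75116 is also used for part of the 16th district.
POSTCODE = re.compile(r"\b75(?:0(\d{2})|1(16))\b")


def district_from_address(address):
    """Return the district number found in the address, or None."""
    match = POSTCODE.search(address)
    if match is None:
        return None
    district = int(match.group(1) or match.group(2))
    return district if 1 <= district <= 20 else None


# Invented (address, price) pairs for illustration.
rows = [
    ("10 Rue Exemple, 75011 Paris", 18),
    ("3 Avenue Exemple, 75004 Paris", 25),
    ("7 Rue Exemple, 75011 Paris", 22),
]

# Group prices by district, skipping addresses without a usable postcode.
prices_by_district = defaultdict(list)
for address, price in rows:
    district = district_from_address(address)
    if district is not None:
        prices_by_district[district].append(price)

average_price = {d: mean(p) for d, p in sorted(prices_by_district.items())}
```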
File: assignmetpizzaprice_paris
This file contains an exercise that, if you are a teacher, you can give to your data analysis students. I wrote it in R, but it can be easily translated into Python if you prefer. It starts with some commands to change variable types or find individuals, etc., and continues with recoding and bivariate analyses. I ask students to create tables, formulate hypotheses, test them, and create a graph. The file contains the answers to each question.
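To give a flavor of what a Python translation of a recoding and table exercise could look like, here is a small sketch. It is not taken from the assignment file: the data, the 20-euro threshold, and the two district groups are all assumptions chosen for illustration.

```python
from collections import Counter

# Invented (district, price) pairs; not real scraped values.
rows = [(4, 25), (11, 18), (11, 22), (16, 30), (2, 15)]


def price_band(price, threshold=20):
    """Recode a numeric price into two categories (threshold is arbitrary)."""
    return "above" if price > threshold else "at_or_below"


def district_group(district):
    """Recode the district number into two groups (grouping is arbitrary)."""
    return "1-10" if district <= 10 else "11-20"


# Contingency table: counts for each (district group, price band) pair.
table = Counter((district_group(d), price_band(p)) for d, p in rows)

for (group, band), count in sorted(table.items()):
    print(group, band, count)
```

A table like this is the usual starting point for a bivariate hypothesis, for example whether higher prices are more common in one group of districts than in the other.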

Owner
- Login: fbietti
- Kind: user
- Repositories: 1
- Profile: https://github.com/fbietti
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Bietti
    given-names: Federico
    orcid: https://orcid.org/0000-0002-3912-3951
title: "The price of pizza in Paris: an example of an assignment in data analysis using The Fork website"
version:
identifiers:
  - type:
    value:
date-released: 2023-01-23
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1