Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README
  • Academic publication links
    Links to: sciencedirect.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.4%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: warint
  • Default Branch: master
  • Size: 4.06 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed almost 3 years ago
Metadata Files
Readme Citation

README

tiger_woods

This repo contains a comprehensive collection of Tiger Woods’ press conferences, given during the tournaments he took part in. Sentiment scores have been computed using several natural language processing techniques.

Cite

Please cite as:

Pastoriza, David, and Thierry Warin. "Dataset of Two Decades of Tiger Woods Press Conferences and Tournament Performance." Data in Brief 41 (April 1, 2022). https://doi.org/10.1016/j.dib.2022.107955.

@article{PASTORIZA2022107955,
  title = {Dataset of two decades of Tiger Woods press conferences and tournament performance},
  journal = {Data in Brief},
  volume = {41},
  pages = {107955},
  year = {2022},
  issn = {2352-3409},
  doi = {https://doi.org/10.1016/j.dib.2022.107955},
  url = {https://www.sciencedirect.com/science/article/pii/S2352340922001664},
  author = {David Pastoriza and Thierry Warin},
  keywords = {Tiger Woods, Press conferences, Sentiment analysis, Natural language processing, Machine learning},
  abstract = {This data article describes a dataset that allows exploring the determinants of superstars’ sentiment in tournaments. It consists of 1,284 press conferences of Tiger Woods in the PGA Tour between 1996 and 2020. We used natural language processing, a form of artificial intelligence, to extract and encode in a quantitative form the sentiment in Tiger Woods press conferences both before the tournament and after the rounds played. Additionally, the dataset provides a series of variables that describe Tiger Woods’ scoring and performance momentum in each round and variables that describe health-related and off-the-course issues that could affect his performance on the course. This data can be useful to understand the sentiment that superstars go through before important tournaments, their sentiment following a major victory or defeat, how that sentiment evolves throughout their athletic career, and how sentiment is associated with performance momentum.}
}

Description of the variables in the dataset

Variable Type Description
Tournament_Year Numeric Yearly season
Tournament_Order Ordinal Chronological order of tournaments within a season (i.e., smaller numbers took place earlier in the season)
Permanent_Tournament_Number Numeric Identification number that is unique to that tournament, regardless of the sponsor/name of the tournament that may change over the years
Course_Number Numeric Course number that does not change over time (i.e., different tournaments may be played on the same course)
Player_Number Numeric Player identification number. Does not change over the years
Round_Number Numeric Round number. PGA Tour tournaments generally have four rounds
Event_Name Text Name of the tournament
Course_Name Text Name of the course in which the tournament took place
Interview_Text Text The full text of the interview
Number_Of_Answers Numeric Number of answers provided by Tiger Woods during the Q&A section
Link Text Link to the original document (redirects to the ASAP Transcription website)
Response_Negative Numeric The negative score computed on Tiger Woods’ responses
Response_Positive Numeric The positive score computed on Tiger Woods’ responses
Response_Sentiment Numeric The positive score minus the negative score computed on Tiger Woods’ responses
Round_Score Numeric Number of strokes of the round
End_of_Round_Pos_numeric_ Ordinal Player’s rank in the round (i.e., 1 means he is leading the tournament, 2 means he is the runner-up, etc.)
Total_Holes_Over_Par Numeric Number of holes in the round on which the player scored bogey or worse
Birdies Numeric Number of birdies in the round
Birdies_Rank Ordinal Player’s rank in number of birdies in the round (i.e., 1 means he was the player with the highest number of birdies in the round)
Bogey_Avoidance_Rank Ordinal Player’s rank in bogey avoidance in the round (i.e., the number of holes on which the player avoided a bogey)
Driving_Distance_Rank Ordinal Player’s rank in driving distance in the round
Driving_Accuracy_Rank Ordinal Player’s rank in driving accuracy in the round
GIR_Rank Ordinal Player’s rank in number of greens in regulation in the round
Scrambling_Rank Ordinal Player’s rank in scrambling in the round (i.e., ability to recover from difficult situations)
Distance_to_leader_strokes Numeric Distance in strokes to the interim leader at the end of the round (e.g., if Tiger is trailing by two shots, it takes value 2; if Tiger is leading, it takes value 0)
Distance_to_leader_ranks Numeric Distance in ranks to the interim leader at the end of the round (e.g., if Tiger is in rank 3, it takes value 2; if Tiger is leading, it takes value 0)
Ranks_gained Numeric Equal to the rank at the end of round n minus the rank at the end of round n-1. For instance, if Tiger was in position 3 at the end of round n and position 5 at the end of round n-1, the variable’s value is −2. Missing for round 1
Strokes_gained_v_a_v_Leader Numeric Equal to the distance to the leader (in strokes) at the end of round n minus the distance to the leader (in strokes) at the end of round n-1. Missing for round 1
Distance_to_runner_up_strokes_ Numeric Distance in strokes to the interim runner-up at the end of the round. For instance, if Tiger is leading by two shots, the variable takes value 2; if Tiger is co-leading, it takes value 0; if Tiger is neither leading nor co-leading, it is missing
Strokes_gained_v_a_v_Runner_up Numeric Distance in strokes to the runner-up at the end of round n minus the distance to the runner-up at the end of round n-1. Missing for round 1 and whenever Tiger is not leading
Injury Categorical {1; 0} Tiger had a minor injury when he entered the event
Major_injury_surgery Categorical {1; 0} Tiger Woods had a major injury when he entered the event
Personal_issues Categorical {1; 0} Tiger Woods had personal issues when he entered the event
Major Categorical {1; 0} Takes value 1 if tournament is a major (i.e., prestigious)
Prize_Money Numeric Prize money of the tournament
SoF Numeric Strength of the field of players in the tournament (i.e., the higher the number, the more competitive the field)
OWGR Numeric Official World Golf Ranking (OWGR) of Tiger at the time of the observation
url Text Link to the website from which the text is scraped
year Numeric Year of the tournament
Permanent_Tournament_Number Numeric Unique tournament identifier across years
Round_Number Ordinal Counter variable for the round within each tournament
Interview_Text Text Scraped text that is analyzed
speech Ordinal Counter variable for the nth article within a given year
fullart_bing_positive Numeric Positive sentiment score of the full article using the “Bing” dictionary
fullart_bing_negative Numeric Negative sentiment score of the full article using the “Bing” dictionary
fullart_bing_sentiment Numeric Overall sentiment score of the full article using the “Bing” dictionary (calculated by subtracting the negative score from the positive score)
fullart_afinn_positive Numeric Positive sentiment score of the full article using the “Afinn” dictionary
fullart_afinn_negative Numeric Negative sentiment score of the full article using the “Afinn” dictionary
fullart_afinn_sentiment Numeric Overall sentiment score of the full article using the “Afinn” dictionary (calculated by subtracting the negative score from the positive score)
fullart_nrc_positive Numeric Positive sentiment score of the full article using the “NRC” dictionary
fullart_nrc_negative Numeric Negative sentiment score of the full article using the “NRC” dictionary
fullart_nrc_sentiment Numeric Overall sentiment score of the full article using the “NRC” dictionary (calculated by subtracting the negative score from the positive score)
resp_bing_positive Numeric Positive sentiment score of the player responses in an article using the “Bing” dictionary
resp_bing_negative Numeric Negative sentiment score of the player responses in an article using the “Bing” dictionary
resp_bing_sentiment Numeric Overall sentiment score of the player responses in an article using the “Bing” dictionary (calculated by subtracting the negative score from the positive score)
resp_afinn_positive Numeric Positive sentiment score of the player responses in an article using the “Afinn” dictionary
resp_afinn_negative Numeric Negative sentiment score of the player responses in an article using the “Afinn” dictionary
resp_afinn_sentiment Numeric Overall sentiment score of the player responses in an article using the “Afinn” dictionary (calculated by subtracting the negative score from the positive score)
resp_nrc_positive Numeric Positive sentiment score of the player responses in an article using the “NRC” dictionary
resp_nrc_negative Numeric Negative sentiment score of the player responses in an article using the “NRC” dictionary
resp_nrc_sentiment Numeric Overall sentiment score of the player responses in an article using the “NRC” dictionary (calculated by subtracting the negative score from the positive score)
quest_bing_positive Numeric Positive sentiment score of the journalist’s questions in an article using the “Bing” dictionary (calculated as the positive sentiment score of the full article minus the positive sentiment score of the responses)
quest_bing_negative Numeric Negative sentiment score of the journalist’s questions in an article using the “Bing” dictionary (calculated as the negative sentiment score of the full article minus the negative sentiment score of the responses)
quest_bing_sentiment Numeric Overall sentiment score of the journalist’s questions in an article using the “Bing” dictionary (calculated as the overall sentiment score of the full article minus the overall sentiment score of the responses)
quest_afinn_positive Numeric Positive sentiment score of the journalist’s questions in an article using the “Afinn” dictionary (calculated as the positive sentiment score of the full article minus the positive sentiment score of the responses)
quest_afinn_negative Numeric Negative sentiment score of the journalist’s questions in an article using the “Afinn” dictionary (calculated as the negative sentiment score of the full article minus the negative sentiment score of the responses)
quest_afinn_sentiment Numeric Overall sentiment score of the journalist’s questions in an article using the “Afinn” dictionary (calculated as the overall sentiment score of the full article minus the overall sentiment score of the responses)
quest_nrc_positive Numeric Positive sentiment score of the journalist’s questions in an article using the “NRC” dictionary (calculated as the positive sentiment score of the full article minus the positive sentiment score of the responses)
quest_nrc_negative Numeric Negative sentiment score of the journalist’s questions in an article using the “NRC” dictionary (calculated as the negative sentiment score of the full article minus the negative sentiment score of the responses)
quest_nrc_sentiment Numeric Overall sentiment score of the journalist’s questions in an article using the “NRC” dictionary (calculated as the overall sentiment score of the full article minus the overall sentiment score of the responses)
sentimentr_fullart Numeric Sentiment of the full article using R’s sentimentr package
sentimentr_resp Numeric Sentiment of the player responses using R’s sentimentr package
sentimentr_quest Numeric Sentiment of the journalist’s questions using R’s sentimentr package (calculated as the sentimentr score of the full article minus the sentimentr score of the responses)
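
Several variables in the table above are defined as simple differences: the overall sentiment scores (positive minus negative), the journalists' question scores (full article minus responses), and the round-over-round variables such as Ranks_gained (round n minus round n-1, missing for round 1). The Python sketch below illustrates that arithmetic only. It is not the authors' code; the function names are hypothetical and the numbers are made up rather than taken from the dataset.

```python
from typing import List, Optional, Sequence


def net_sentiment(positive: float, negative: float) -> float:
    """Overall sentiment: positive score minus negative score."""
    return positive - negative


def question_score(full_article: float, responses: float) -> float:
    """Journalists' score: full-article score minus the responses' score."""
    return full_article - responses


def round_over_round_change(values: Sequence[float]) -> List[Optional[float]]:
    """Value at the end of round n minus the value at the end of round n-1.

    Round 1 has no previous round, so its entry is None (missing).
    """
    if not values:
        return []
    changes: List[Optional[float]] = [None]
    for previous, current in zip(values, values[1:]):
        changes.append(current - previous)
    return changes


# Made-up end-of-round ranks for one four-round tournament (not real data).
ranks = [5, 3, 3, 1]
# Rank 1 is the leader, so the distance in ranks is rank - 1.
distance_to_leader_ranks = [rank - 1 for rank in ranks]
ranks_gained = round_over_round_change(ranks)

# Made-up dictionary scores for one press conference (not real data).
fullart_positive, fullart_negative = 12, 5
resp_positive, resp_negative = 9, 3
quest_positive = question_score(fullart_positive, resp_positive)
quest_negative = question_score(fullart_negative, resp_negative)
quest_sentiment = question_score(
    net_sentiment(fullart_positive, fullart_negative),
    net_sentiment(resp_positive, resp_negative),
)

print(distance_to_leader_ranks)  # [4, 2, 2, 0]
print(ranks_gained)              # [None, -2, 0, -2]
print(quest_positive, quest_negative, quest_sentiment)  # 3 2 1
```

With these definitions a negative Ranks_gained value means the player moved up the leaderboard, which matches the example given for that variable (position 5 to position 3 gives −2).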

Owner

  • Name: Thierry Warin
  • Login: warint
  • Kind: user
  • Location: Montreal
  • Company: HEC Montréal

Professor of Data Science

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this dataset, please cite it as below."
authors:
- family-names: "Pastoriza"
  given-names: "David"
  orcid: "https://orcid.org/0000-0002-7664-5302"
- family-names: "Warin"
  given-names: "Thierry"
  orcid: "https://orcid.org/0000-0002-5921-3428"
title: "Dataset of two decades of Tiger Woods press conferences and tournament performance"
journal: "Data in Brief"
version: 0.1.0
doi: 10.1016/j.dib.2022.107955
date-released: 2022
url: "https://www.data-in-brief.com/article/S2352-3409(22)00166-4/fulltext"
