UTDEventData

UTDEventData: An R package to access political event data - Published in JOSS (2019)

https://github.com/katehyoung/utdeventdata

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    2 of 5 committers (40.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bigdata data-incentive political-event utd-server
Last synced: 6 months ago

Repository

An R package to retrieve political event data from the UTD API server

Basic Info
  • Host: GitHub
  • Owner: KateHyoung
  • License: lgpl-3.0
  • Language: R
  • Default Branch: master
  • Size: 2.75 MB
Statistics
  • Stars: 16
  • Watchers: 5
  • Forks: 7
  • Open Issues: 7
  • Releases: 1
Created about 8 years ago · Last pushed over 3 years ago

README.md

UTDEventData ver. 1.0.0


The UTDEventData R package provides an interface to extract data from the UTD Event Data server. This package is stable and actively maintained/updated. Your comments, feedback and suggestions are welcome.
If you have any questions about the package, please contact Marcus Sianan (Marcus.Sianan@UTDallas.edu) or open an issue (https://github.com/KateHyoung/UTDEventData/issues).

Note: Our server now provides access to the 'CLINEPHOENIXLNNYT' data, which contains several million events from 17.5 million New York Times news stories (1945 - 2019) and is provided by the Open Event Data Alliance. You can find more information by clicking the link here.

This package is part of the "Modernizing Political Event Data for Big Data Social Science Research" project. More information can be found on the project webpage.

Several functions to preview and download data are listed below. More details of these methods are illustrated in the vignette.

  • citeData( ): for citing the package and data tables in the UTD server for publications
  • DataTables( ): for looking up data tables in the UTD server
  • tableVar( ): for looking up the variables of a data table
  • previewData( ): for previewing the data structure of a data table
  • pullData( ): for downloading data by countries and time periods
  • entireData( ): for downloading an entire data table
  • getQuerySize(): for measuring the size of requested data from the UTD server
  • sendQuery( ): for requesting built queries from the API server to download data
  • Table: a reference class

Leaf Query Block functions:

  • returnTimes( ): create a query block by time periods
  • returnCountries( ): create a query block by countries
  • returnLatLon( ): create a query block by latitude and longitude
  • returnDyad( ): create a query block of a dyad for both source and target actors
  • returnRegExp( ): create a query block by pattern of attributes in a data table

Branch Query Block functions:

  • orList( ): match records that satisfy any of the child query blocks
  • andList( ): match records that satisfy all of the child query blocks
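
The "any" versus "all" matching described for the branch query blocks can be illustrated locally with base R. This is only a sketch of the boolean semantics on toy data, not the package's implementation and not a call to orList( ) or andList( ); the data frame and its column names are assumptions made for illustration.

```r
# Toy data standing in for event records; the column names are assumptions.
events <- data.frame(
  country = c("RUS", "SYR", "USA", "RUS", "USA"),
  year    = c(2017, 2018, 2017, 2018, 2018),
  stringsAsFactors = FALSE
)

# Two "leaf" conditions, each a logical vector with one entry per record.
is_rus  <- events$country == "RUS"
is_2018 <- events$year == 2018

# "or" semantics: keep records that satisfy any child condition.
any_match <- Reduce(`|`, list(is_rus, is_2018))

# "and" semantics: keep records that satisfy all child conditions.
all_match <- Reduce(`&`, list(is_rus, is_2018))

events[any_match, ]
events[all_match, ]
```

Using Reduce over a list of conditions mirrors the idea of a branch block holding an arbitrary number of child blocks.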

Installation

Without the vignette: devtools::install_github("KateHyoung/UTDEventData")

With the vignette: devtools::install_github("KateHyoung/UTDEventData", build_vignettes=TRUE)

Users with newer versions of R may need to follow this format:

    install.packages("devtools")
    library(remotes)
    install_github("KateHyoung/UTDEventData")
    library(devtools)
    library(UTDEventData)

Retrieve an API key

Access to the UTD data server requires an API key. To obtain an API key, follow the link and fill the form: https://eventdata.utdallas.edu/signup. Please check your spam and junk email if you do not receive the API key in your inbox.

Using the API key

Method 1: Pass the key as the first argument

You will need to pass the key on every function call.

    k <- '...your API key...'
    DataTables(utd_api_key = k)

Method 2: Store the key in an environment variable

Set the default API key by setting the environment variable UTDAPIKEY.
```
Sys.setenv(UTDAPIKEY = "...your API key...")

DataTables()
tableVar(table = "icews", lword = "target")
```
*Note: Method 2 currently works only with `DataTables()`, `tableVar()`, and `previewData()`. We plan to expand this method to other functions that require an API key.*

Further examples will assume the API key is set in an environment variable.
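
The two methods can be combined in a small wrapper that prefers an explicitly passed key and otherwise falls back to the UTDAPIKEY environment variable. The sketch below is an assumption for illustration: `resolve_utd_key()` is a hypothetical helper, not a function in UTDEventData, and "demo-key" is a placeholder value.

```r
# Hypothetical helper, not part of UTDEventData: resolve an API key from an
# explicit argument first, then from the UTDAPIKEY environment variable.
resolve_utd_key <- function(key = NULL) {
  if (!is.null(key) && nzchar(key)) {
    return(key)
  }
  env_key <- Sys.getenv("UTDAPIKEY", unset = "")
  if (!nzchar(env_key)) {
    stop("No API key found: pass one explicitly or set UTDAPIKEY.")
  }
  env_key
}

# Placeholder value for illustration only; substitute your own key.
Sys.setenv(UTDAPIKEY = "demo-key")
resolve_utd_key()             # uses the environment variable
resolve_utd_key("other-key")  # an explicit key takes precedence
```

The resolved value could then be passed as `utd_api_key` to functions that still require the key as an argument.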

Data Preview

Retrieve a sample of 100 observations.

    dataSample <- previewData(table_name = "PHOENIX_RT")
    View(dataSample)

Data Download (quick)

pullData() can be used to retrieve data subsetted by country names and dates.

    subset1 <- pullData(table_name = "phoenix_rt", country = list('canada','China'), start = '20171101', end = '20171102', citation = TRUE)
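
The start and end values in the pullData() example look like compact 'YYYYMMDD' strings. Assuming that format, the sketch below builds such strings from ordinary dates with base R; `to_utd_date()` is a hypothetical helper and is not part of UTDEventData.

```r
# Hypothetical helper, not part of UTDEventData: convert a date to the
# compact 'YYYYMMDD' string form seen in the pullData() example.
to_utd_date <- function(x) {
  d <- as.Date(x)
  if (any(is.na(d))) {
    stop("Could not parse date: ", x)
  }
  format(d, "%Y%m%d")
}

start <- to_utd_date("2017-11-01")
end   <- to_utd_date(as.Date("2017-11-01") + 1)  # one day later

# Fixed-width digit strings sort in date order, so this is a cheap sanity check.
stopifnot(start <= end)
```

Building the strings from Date objects makes it easier to compute ranges (for example, one day or one month after a start date) without hand-editing the digits.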

Data Download (custom)

More complex queries with intersections, unions and multiple sets of constraints may be submitted via the sendQuery() function. More details on this method are provided in the vignette.

Example Usage

```
# 'utdapikey' is a placeholder for your API key
dt <- pullData('utdapikey', "Phoenix_rt", list("RUS", "SYR"),
               start = "20180101", end = "20180331", citation = FALSE)

# query the fight events by CAMEO codes
Fgt <- dt[dt$code %in% c("190", "191", "192", "193", "194", "195", "1951", "1952", "196"), ]
Fgt <- Fgt[, 1:23]  # remove url and oid columns

tb <- table(Fgt$country_code, Fgt$month)  # monthly incidents

barplot(tb, main = "Monthly Fight Incidents between RUS and SYR",
        col = c("darkblue", "red"), legend = rownames(tb),
        beside = TRUE, xlab = "Month in 2018")
```


Military-related fights between Russia and Syria from January 2018 to March 2018, shown by month. Event types are identified by CAMEO codes in the Phoenix real-time data.
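
The filtering and tabulation steps can be tried without an API key on a small mock table. The sketch below is illustrative only: the toy rows are invented, and only the three columns used in the example (code, country_code, month) are mocked. It relies on the fact that CAMEO root code 19 is the "Fight" category, so a prefix match is used here in place of the explicit code list.

```r
# Toy records standing in for a downloaded table; the values are invented.
dt <- data.frame(
  code         = c("190", "1951", "043", "193", "036", "195"),
  country_code = c("RUS", "SYR", "RUS", "SYR", "SYR", "RUS"),
  month        = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# CAMEO root code 19 is "Fight", so a prefix match covers the explicit list
# (190-196, 1951, 1952); it would also catch a bare "19" if one were present.
fight <- dt[startsWith(dt$code, "19"), ]

# Count fight events per country and month.
tb <- table(fight$country_code, fight$month)
tb
```

The resulting table has the same shape as the one passed to barplot() in the example, with countries as rows and months as columns.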

Vignette

Access the vignette by executing the following R snippet. This requires an initial package installation with build_vignettes=TRUE.

    vignette("UTDEventData")

Alternatively, download the PDF version here

Authors

Marcus Sianan Marcus.Sianan@UTDallas.edu (Maintainer)

Dr. Patrick T. Brandt pbrandt@utdallas.edu
Dr. Vito D'Orazio dorazio@utdallas.edu
Dr. Latifur Khan lkhan@utdallas.edu
Dr. HyoungAh(Kate) Kim kate0550@gmail.com
Michael J. Shoemate michael.shoemate@utdallas.edu
Sayeed Salam sxs149331@utdallas.edu
Jared Looper jrl140030@utdallas.edu

Community Guidelines

This project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. Feedback, bug reports, and feature requests can be submitted here. You may request to store a dataset in the UTD Event Data server by contacting one of the authors. Those who request to store data as collaborators also agree to abide by the terms specified in the Contributor Code of Conduct.

License

GPL-3
This package is supported by the RIDIR project funded by the National Science Foundation, Grant No. SBE-SMA-1539302.

JOSS Publication

UTDEventData: An R package to access political event data
Published
April 22, 2019
Volume 4, Issue 36, Page 1322
Authors
HyoungAh Kim
School of Economic, Political and Policy Sciences, University of Texas at Dallas
Vito D’Orazio
School of Economic, Political and Policy Sciences, University of Texas at Dallas
Patrick T. Brandt
School of Economic, Political and Policy Sciences, University of Texas at Dallas
Jared Looper
School of Economic, Political and Policy Sciences, University of Texas at Dallas
Sayeed Salam
Department of Computer Science, University of Texas at Dallas
Latifur Khan
Department of Computer Science, University of Texas at Dallas
Michael Shoemate
School of Natural Sciences and Mathematics, University of Texas at Dallas
Editor
Alex Hanna
Tags
political event data, real-time political events, data-intensive social research, big data

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 220
  • Total Committers: 5
  • Avg Commits per committer: 44.0
  • Development Distribution Score (DDS): 0.077
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
HyoungAh Kim k****0@g****m 203
MarcusMMS m****8@u****u 14
Shoeboxam s****m@g****m 1
Daniel S. Katz d****z@i****g 1
Andrew Heiss a****s@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 8
  • Total pull requests: 11
  • Average time to close issues: 9 months
  • Average time to close pull requests: 16 days
  • Total issue authors: 6
  • Total pull request authors: 5
  • Average comments per issue: 3.25
  • Average comments per pull request: 0.09
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ysapolovych (2)
  • babakrezaee (2)
  • andrewheiss (1)
  • lsyaseen (1)
  • moneymunch (1)
  • IshitaGopal (1)
Pull Request Authors
  • KateHyoung (6)
  • MarcusMMS (2)
  • andrewheiss (1)
  • Shoeboxam (1)
  • danielskatz (1)

Dependencies

DESCRIPTION cran
  • countrycode * imports
  • curl * imports
  • jsonlite * imports
  • methods * imports
  • rjson * imports
  • stats * imports
  • knitr * suggests
  • rmarkdown * suggests