Detecting Fraud in Online Surveys by Tracing, Scoring, and Visualizing IP Addresses

Detecting Fraud in Online Surveys by Tracing, Scoring, and Visualizing IP Addresses - Published in JOSS (2019)

https://github.com/mahdlab/rip

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords from Contributors

standardization
Last synced: 6 months ago

Repository

Detecting Fraud in Online Surveys by Tracing, Scoring, and Visualizing IP Addresses

Basic Info
  • Host: GitHub
  • Owner: MAHDLab
  • License: mit
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 432 KB
Statistics
  • Stars: 25
  • Watchers: 5
  • Forks: 4
  • Open Issues: 2
  • Releases: 1
Created over 7 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog Contributing License Code of conduct

README.md

rIP detects fraud in online surveys by tracing, scoring, and visualizing IP addresses

rIP takes an array of IPs and the keys for the services the user wishes to use (IP Hub, IP Intel, and Proxycheck), and passes them to the respective APIs. It returns a dataframe with the IP addresses (used for merging), country, ISP, labels for non-US IP addresses, VPS use, and recommendations for blocking. Users also have the option to visualize the distributions.

Especially important is the variable "block", which gives a score indicating whether the IP address is likely from a server farm and should be excluded from the data. It is coded 0 if the IP is residential/unclassified (i.e., a safe IP), 1 if the IP is non-residential (hosting provider, proxy, etc.; should likely be excluded), and 2 for non-residential and residential IPs (more stringent, may flag innocent respondents).
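As a minimal sketch of how the "block" score might be used downstream, the snippet below merges a mocked-up lookup result back onto survey data by IP address and separates the rows by code. The data frames, the column layout, and the decision to review code 2 by hand are illustrative assumptions, not the package's actual output; check the column names returned by `getIPinfo()` before adapting it.

```{R}
# Hypothetical survey responses (illustrative data only)
survey <- data.frame(
  id = 1:4,
  IPAddress = c("30.139.234.173", "105.21.175.134",
                "221.167.14.219", "205.218.125.55"),
  stringsAsFactors = FALSE
)

# Mock of a lookup result; in practice this would come from getIPinfo().
# Only the meaning of `block` (0, 1, 2) is taken from the README;
# the column layout here is assumed.
ip_info <- data.frame(
  IPAddress = c("30.139.234.173", "105.21.175.134",
                "221.167.14.219", "205.218.125.55"),
  block = c(0, 1, 0, 2),
  stringsAsFactors = FALSE
)

# Merge on the IP address column, keeping every survey row
merged <- merge(survey, ip_info, by = "IPAddress", all.x = TRUE)

# Keep rows coded 0 (residential/unclassified)
clean <- merged[!is.na(merged$block) & merged$block == 0, ]

# Set rows coded 2 aside for manual review, since that code is more
# stringent and may flag innocent respondents
review <- merged[!is.na(merged$block) & merged$block == 2, ]
```

Rows coded 1 are simply excluded from both subsets here; whether to drop them outright is a judgment call for the analyst.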

Note: rIP requires users to have active (free) accounts and/or valid keys at iphub, ipintel, and/or proxycheck. Users may pass any number of IP service keys to the function (e.g., 1, 2, or all 3). The function works with any combination.
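Because any subset of the three keys may be supplied, one way to organize a call is to collect whichever keys are available and pass only those. The sketch below reads keys from environment variables and builds a named list. The environment variable names and the `collect_keys()` helper are hypothetical, and the argument names `iphub_key`, `ipintel_key`, and `proxycheck_key` are assumed to match the installed version of `getIPinfo()` (check `?getIPinfo`). The actual call is left commented out so the snippet runs without network access.

```{R}
# Hypothetical environment variable names, mapped to the assumed
# getIPinfo() argument names; set them in ~/.Renviron or with Sys.setenv()
key_vars <- c(
  iphub_key      = "IPHUB_KEY",
  ipintel_key    = "IPINTEL_KEY",
  proxycheck_key = "PROXYCHECK_KEY"
)

# Return a named list containing only the keys that are actually set
collect_keys <- function(vars = key_vars) {
  vals <- Sys.getenv(vars, unset = "")
  names(vals) <- names(vars)
  as.list(vals[nzchar(vals)])
}

keys <- collect_keys()

# With at least one key present, the call could be assembled like this:
# args <- c(list(ipsample, "IPAddress"), keys)
# result <- do.call(rIP::getIPinfo, args)
```

Keeping keys in environment variables also avoids committing them to a script or repository.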

See some related working papers here and here. And the R-Bloggers post is here.

Code of Conduct

Please review and abide by our contributor code of conduct if you'd like to contribute to the tool. Thanks!

Installation

Users can install either the stable version released on CRAN (v1.2.0):

```{R}
install.packages("rIP")
library(rIP)
```

Or the dev version directly from our GitHub repo:

```{R}
devtools::install_github("MAHDLab/rIP")
library(rIP)
```

Usage

```{R}
# Load the library
library(rIP)

# Store personal keys for IP service pings (here we include only "ipHub" as an example)
ip_hub_key <- "MzI2MTpkOVpld3pZTVg1VmdTV3ZPenpzMmhodkJmdEpIMkRMZQ=="

# Generate list of random IP addresses
ipsample <- data.frame(rbind(
  c(1, "30.139.234.173"), c(2, "105.21.175.134"), c(3, "221.167.14.219"), c(4, "205.218.125.55"),
  c(5, "191.231.0.156"), c(6, "95.107.54.16"), c(7, "244.206.230.230"), c(8, "210.38.216.32"),
  c(9, "17.120.223.85"), c(10, "146.153.75.77"), c(11, "246.149.59.225"), c(12, "86.77.82.141"),
  c(13, "89.151.46.115"), c(14, "229.123.227.10"), c(15, "21.175.8.185"), c(16, "187.193.209.68"),
  c(17, "74.52.31.169"), c(18, "255.99.244.220"), c(19, "149.106.54.194"), c(20, "244.214.245.239")
))

# Store in df
names(ipsample) <- c("number", "IPAddress")

# Call the function (using only iphub service)
getIPinfo(ipsample, "IPAddress", iphub_key = ip_hub_key)
```
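Survey exports often contain repeated or malformed addresses. As a hedged sketch, the helper below (a hypothetical function, not part of rIP) reduces a column to unique, syntactically valid IPv4 strings before any lookup, so each address is queried at most once. It checks dotted-quad syntax only and makes no claim about whether an address is routable.

```{R}
# Hypothetical helper: keep unique, well-formed IPv4 strings
valid_unique_ipv4 <- function(x) {
  # Normalize to trimmed character strings and drop duplicates
  x <- unique(trimws(as.character(x)))
  # One octet: 0-255 without leading zeros
  octet <- "(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
  pattern <- paste0("^", octet, "(\\.", octet, "){3}$")
  # Drop missing values and anything that is not a dotted quad
  x[!is.na(x) & grepl(pattern, x)]
}

# Illustrative input with a duplicate, stray whitespace, and invalid entries
ips <- c("30.139.234.173", "30.139.234.173", " 95.107.54.16",
         "999.1.1.1", "not an ip", NA)

to_query <- data.frame(IPAddress = valid_unique_ipv4(ips),
                       stringsAsFactors = FALSE)
```

The resulting `to_query` data frame has the same shape as `ipsample` above (a column of addresses referenced by name), so it could be passed to the lookup in the same way.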

Visual Output from Function Call

Acknowledgements

We thank Tyler Burleigh, Bob Rudis, Ryan Jewell, and Nick Winter for their help on this tool.

Owner

  • Name: MAHDLab
  • Login: MAHDLab
  • Kind: organization
  • Location: Houston, TX

Machine-Assisted Human Decision-making (MAHD) Lab

JOSS Publication

Detecting Fraud in Online Surveys by Tracing, Scoring, and Visualizing IP Addresses
Published
May 23, 2019
Volume 4, Issue 37, Page 1285
Authors
Philip D. Waggoner
College of William & Mary, MAHD Lab, University of Houston
Ryan Kennedy
MAHD Lab, University of Houston
Scott Clifford
MAHD Lab, University of Houston
Editor
Yo Yehudi
Tags
Mechanical Turk fraud online surveys survey experiments quality

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 195
  • Total Committers: 4
  • Avg Commits per committer: 48.75
  • Development Distribution Score (DDS): 0.113
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Philip Waggoner | 3****r | 173 |
| rkennedy01 | r****y@g****m | 19 |
| boB Rudis | b****b@r****s | 2 |
| Daniel S. Katz | d****z@i****g | 1 |

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 10
  • Total pull requests: 5
  • Average time to close issues: 2 months
  • Average time to close pull requests: 40 minutes
  • Total issue authors: 7
  • Total pull request authors: 3
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.8
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • pdwaggoner (4)
  • bshor (1)
  • hannah-solheim (1)
  • topa202020 (1)
  • ip2location (1)
  • chriscastille6 (1)
  • jpospina88 (1)
Pull Request Authors
  • pdwaggoner (3)
  • hrbrmstr (1)
  • danielskatz (1)
Top Labels
Issue Labels
  • enhancement (4)
  • bug (2)
Pull Request Labels
  • enhancement (1)

Dependencies

DESCRIPTION cran
  • amerika * imports
  • dplyr * imports
  • graphics * imports
  • httr * imports
  • iptools * imports
  • jsonlite * imports
  • utils * imports
  • testthat >= 2.1.0 suggests