Science Score: 18.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.4%) to scientific vocabulary
Repository
SD&D Class Project
Statistics
- Stars: 0
- Watchers: 3
- Forks: 3
- Open Issues: 8
- Releases: 0
Metadata Files
README.md
InfoRoots
A Software Design & Documentation Project at RPI
Team Members
- Siwen Zhang
- Joseph Om
- Jianing Lin
- Lei Luo
Vision Statements
Executive Summary
Fake news is prevalent on the internet, especially on topics such as climate change and vaccination. The average online reader has neither the time nor the knowledge to identify false information. InfoRoots uses automated information retrieval to fight fake news and misinformation. Our web platform helps users investigate online news and articles by providing analytic information about authors, publishers, and content. With InfoRoots, users can judge the accuracy of what they read with little effort.
Market Potential
Fact-checking is the traditional way to counter fake news. Fact-checking websites rely on journalism experts to provide analytic reviews of news and factual claims. PolitiFact, Snopes, and FactCheck.org are three mainstream fact-checking organizations that publish fact-checked articles on their websites. Instead of focusing on articles, NewsGuard produces professional reviews of online news sources and publishers. Users can install NewsGuard’s browser extension to check publisher reviews while reading news and articles. Because professional fact-checking takes a large amount of time, these organizations cannot cover every claim and article on the internet. Crowdsourcing and machine learning are relatively new approaches in this market. Our.news is developing a browser extension that gives readers both publisher information and crowdsourced reviews, but it has not been effective so far because its user base is small. InfoRoots will be an innovative, machine-learning-based business in this market.
Stakeholders
InfoRoots has two groups of project stakeholders: team instructors and team members. The team instructors consist of one overall supervisor and several teaching assistants. John Sturman is the overall supervisor, and Charly Huang and Vaishnavi Neema are the two teaching assistants. Their responsibility is to help InfoRoots develop successfully and launch in its market. As an experienced project manager, John Sturman offers team members courses on the design and development of InfoRoots. The two teaching assistants provide feedback on the deliverables of InfoRoots.
The team members are four undergraduate students: Siwen Zhang, Joseph Om, Jianing Lin, and Lei Luo. Under the Scrum framework, Lei Luo serves as both the Product Owner and the Scrum Master. All team members work as designers and programmers to develop the InfoRoots web platform.
Major Features
The InfoRoots web platform is designed for investigating online articles. When readers enter an article link on InfoRoots, they see three major features that help them determine whether the article's content is false.
The first feature is author information. It presents not only background information about the authors from Wikipedia but also reliability scores computed by our machine learning algorithm. The algorithm produces these scores by examining recent articles written by the authors.
The second feature is publisher information. It offers professional publisher ratings from non-partisan fact-checking organizations, such as NewsGuard. It also presents the ratings of other publishers that produce similar content, so users can evaluate the credibility of information by comparing different publishers.
The third feature is citation and content analysis. The analysis system pinpoints all citations in the original article and extracts the relevant paragraphs from the cited sources. These paragraphs are shown to readers when they click on a citation. As users read through the article on InfoRoots, they can check two reliability factors: whether the cited information comes from reliable publishers, and whether the original article accurately presents the content of its citations.
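The first step of the citation analysis, locating the citations inside each paragraph, can be sketched with the Python standard library. This is only an illustration under stated assumptions, not the project's crawler: it assumes the article arrives as HTML, that paragraphs are `<p>` elements, and that a citation is any absolute `http(s)` link inside a paragraph. The names `CitationLinkParser` and `extract_citations` are hypothetical.

```python
from html.parser import HTMLParser


class CitationLinkParser(HTMLParser):
    """Collect, for each <p> paragraph, its text and the links it cites.

    Hypothetical helper: treats every absolute http(s) link inside a
    paragraph as a citation, which is an assumption of this sketch.
    """

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # list of {"text": str, "links": [str]}
        self._current = None   # paragraph being built, None outside <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = {"text": "", "links": []}
        elif tag == "a" and self._current is not None:
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self._current["links"].append(href)

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            # Collapse runs of whitespace before storing the paragraph.
            self._current["text"] = " ".join(self._current["text"].split())
            self.paragraphs.append(self._current)
            self._current = None


def extract_citations(html):
    """Return one {"text", "links"} dict per paragraph in the HTML."""
    parser = CitationLinkParser()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


sample = (
    "<p>Sea levels rose, <a href='https://example.org/report'>a report</a> says.</p>"
    "<p>No sources here.</p>"
)
result = extract_citations(sample)
```

Each entry in `result` pairs a paragraph with its outbound links, which is roughly the information a reader would need when clicking a citation next to the text that cites it.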
Major Risks
InfoRoots has two potential risks. The first concerns the completion of InfoRoots. Each proposed feature requires a certain level of knowledge of machine learning and data scraping. Since the team members have little experience developing the major features described above, they will spend the majority of their time in the exploration and research phase, and the final deliverable might contain incomplete features. Shorter work cycles can mitigate this risk because they allow frequent reviews and revisions of features under development.
The second risk is that all proposed features require a lot of computing power. To test the major features, InfoRoots might spend a lot of money on cloud computing subscriptions. If one of the major features consumes expensive computing resources, the team might revise that feature to save money. In other words, the three major features are highly subject to change.
Owner
- Name: phantomlei
- Login: phantomlei3
- Kind: user
- Repositories: 34
- Profile: https://github.com/phantomlei3
Citation (citationsNetwork.py)
import hashlib
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from multiprocessing import Process
from PostgreSQL.database import database
from article import article


class citationsNetwork:
    '''
    citationsNetwork represents all citations in one article.
    It communicates with the database mediator to obtain data and invokes the scrapy crawler to extract information.
    It contains all functions related to building the citations network on the user interface.
    '''

    def __init__(self, article_id):
        '''
        The initializer only stores the given article id and creates the database mediator.
        '''
        self.article_id = article_id
        self.db = database()

    def get(self):
        '''
        The main function in citationsNetwork: look up the article's citations in the citations table
        and use the article class to obtain information for each citation link.
        :return: a json dictionary that contains:
            'article_paragraphs': []
            'citation_links': []
            'citation_info': {'link': {'article_title', 'article_content', 'article_credibility'}}
        :return: None if the article is not in the database
        '''
        json_dict = dict()
        # extract information from database
        citation_results = self.db.lookup_citation(self.article_id)
        if citation_results is not None:
            article_paragraphs = citation_results[0]
            citation_links = citation_results[1]
            # obtain specific info for each citation link
            ## TODO: Advance parallel processing
            ## NOW: limit three citations only (for convenience)
            citation_info = dict()
            count = 0
            for i in range(len(citation_links)):
                one_info = dict()
                # skip non-existent citations and anything past the three-citation limit
                if citation_links[i] == "None" or count >= 3:
                    citation_links[i] = "None"
                    continue
                cited_article = article(citation_links[i])
                article_result = cited_article.get()
                if article_result is not None:
                    # create inner dict to store information for one citation
                    one_info['article_title'] = article_result['article_title']
                    one_info['article_content'] = article_result['article_content']
                    one_info['article_credibility'] = article_result['article_reliability']
                    citation_info[citation_links[i]] = one_info
                    count += 1
                else:
                    # blank out citation links that have no article profile
                    citation_links[i] = "None"
            json_dict['article_paragraphs'] = article_paragraphs
            json_dict['citation_links'] = citation_links
            json_dict['citation_info'] = citation_info
            return json_dict
        else:
            return None
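The loop inside `get()` can be tried without the database or the crawler. The sketch below is a hedged, standalone restatement of that loop: `build_citation_info`, `fake_fetch`, and `FAKE_ARTICLES` are hypothetical names that do not exist in the repository, and the fetch callable stands in for `article(link).get()`. Unlike the original, it works on a copy of the link list instead of modifying it in place.

```python
def build_citation_info(citation_links, fetch_article, limit=3):
    """Mirror the per-citation loop of citationsNetwork.get().

    fetch_article is a stand-in (assumption) for article(link).get():
    it returns a dict with 'article_title', 'article_content' and
    'article_reliability', or None when no profile is available.
    """
    links = list(citation_links)  # copy, so the caller's list is untouched
    info = {}
    count = 0
    for i, link in enumerate(links):
        # Skip missing citations and anything past the limit.
        if link == "None" or count >= limit:
            links[i] = "None"
            continue
        result = fetch_article(link)
        if result is None:
            # No profile for this link: blank it out.
            links[i] = "None"
            continue
        info[link] = {
            "article_title": result["article_title"],
            "article_content": result["article_content"],
            "article_credibility": result["article_reliability"],
        }
        count += 1
    return links, info


# Made-up data for illustration only.
FAKE_ARTICLES = {
    "https://a.example/%d" % n: {
        "article_title": "Title %d" % n,
        "article_content": "Body %d" % n,
        "article_reliability": 0.5,
    }
    for n in range(1, 5)
}


def fake_fetch(link):
    return FAKE_ARTICLES.get(link)


links_in = [
    "https://a.example/1",
    "None",
    "https://missing.example",
    "https://a.example/2",
    "https://a.example/3",
    "https://a.example/4",
]
links_out, info = build_citation_info(links_in, fake_fetch)
```

With this input, the unknown link and the fourth resolvable link are both replaced by `"None"`, so `links_out` keeps the same length as the input and the positions still line up with the article's citations.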
Dependencies
- 132 dependencies
- axios ^0.19.2
- cors ^2.8.5
- express ^4.17.1
- history ^4.10.1
- react-router-dom ^5.1.2
- zeromq ^5.2.0
- 1398 dependencies
- @iconify/icons-oi ^1.0.3
- @iconify/react ^1.1.3
- axios ^0.19.0
- bulma ^0.7.5
- classnames ^2.2.6
- jwt-decode ^2.2.0
- react ^16.9.0
- react-bulma-components ^2.3.0
- react-countdown-circle-timer 1.0.6
- react-dom ^16.9.0
- react-rater ^5.1.1
- react-redux ^7.1.1
- react-router-dom ^5.1.2
- react-scripts 3.1.1
- react-scroll-up-button ^1.6.4
- redux ^4.0.4
- redux-thunk ^2.3.0
- styled-components ^4.3.2