smartcite

SmartCite is a web-based citation generator that helps users create accurate APA, MLA, and Chicago citations from URLs, DOIs, or manual input through an ad-free interface.

https://github.com/rajpatel2518/smartcite

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.8%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: rajpatel2518
  • Language: HTML
  • Default Branch: main
  • Size: 12.7 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Citation

README.md

SmartCite

SmartCite is a web-based citation generator that helps users create accurate APA, MLA, and Chicago citations from URLs, DOIs, or manual input through an ad-free interface.

🛠 Requirements

  • Python 3.8 or newer
  • Required packages listed in requirements.txt

🔧 Setup Instructions

  1. Clone the repository:

         git clone https://github.com/your-username/SmartCite.git
         cd SmartCite

  2. (Optional) Create a virtual environment:

         python3 -m venv venv
         source venv/bin/activate

  3. Install dependencies:

         pip install -r requirements.txt

  4. Run the application:

         python app.py

  5. Open the app in your browser:

     http://127.0.0.1:5000
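
The description mentions manual input, but the README does not show that path. As a rough sketch only, user-supplied fields could be formatted with the same APA layout that `apa_compile` in citation_logic.py uses. The `ManualEntry` class and `format_apa_manual` function are hypothetical names introduced for illustration and are not part of the repository.

```python
from dataclasses import dataclass


@dataclass
class ManualEntry:
    """User-supplied citation fields (hypothetical container, not in the repo)."""
    first_name: str = ""
    last_name: str = ""
    page_title: str = "Untitled Page"
    website_title: str = "Unknown"
    date_published: str = ""
    url: str = ""


def format_apa_manual(entry: ManualEntry) -> str:
    """Build an APA-style string using the same layout as apa_compile."""
    has_author = bool(entry.first_name and entry.last_name)
    date = entry.date_published or "n.d."
    if has_author:
        lead = f"{entry.last_name}, {entry.first_name[0]}. ({date}). {entry.page_title}. "
    else:
        # Without an author, the title moves to the front of the citation.
        lead = f"{entry.page_title} ({date}). "
    return f"{lead}{entry.website_title}. Retrieved from {entry.url}"
```

Because the formatter takes plain fields and does no network access, it can be exercised without starting the web server.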

Owner

  • Login: rajpatel2518
  • Kind: user

Citation (citation_logic.py)

import urllib.request
import datetime
import re
import json
from bs4 import BeautifulSoup
from dateutil import parser


def citation_components(web_address):
    first_name = ""
    last_name = ""
    page_title = "Untitled Page"
    website_title = "Unknown"
    date_published = ""
    date_accessed = datetime.date.today().strftime("%B %d, %Y")

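    # Fetch and parse the page; on any network or parsing failure, return the defaults.
    try:
        req = urllib.request.Request(web_address, headers={"User-Agent": "Mozilla/5.0"})
        # A timeout keeps a slow or unresponsive site from hanging the request.
        with urllib.request.urlopen(req, timeout=10) as response:
            html = response.read()
        soup = BeautifulSoup(html, 'html.parser')
    except Exception:
        return (first_name, last_name, page_title, website_title, date_published, date_accessed)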

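    # Prefer structured JSON-LD metadata (schema.org Article / NewsArticle) when present.
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except (TypeError, ValueError):
            continue  # empty or malformed JSON-LD block
        if not isinstance(data, dict):
            continue
        # "@type" may be a single string or a list of strings.
        types = data.get("@type")
        types = types if isinstance(types, list) else [types]
        if not any(t in ("NewsArticle", "Article") for t in types):
            continue
        author_data = data.get("author")
        if isinstance(author_data, list):
            # Use the first listed author only.
            author_data = author_data[0] if author_data else None
        if isinstance(author_data, dict):
            author_data = author_data.get("name", "")
        author_name = author_data.strip() if isinstance(author_data, str) else ""
        name_parts = author_name.split()
        if len(name_parts) >= 2:
            first_name = name_parts[0]
            last_name = name_parts[-1]
        try:
            if data.get("datePublished"):
                parsed = parser.parse(data["datePublished"])
                date_published = parsed.strftime("%B %d, %Y")
        except (TypeError, ValueError, OverflowError):
            pass  # leave date_published empty for the later fallbacks
        if isinstance(data.get("headline"), str):
            page_title = data["headline"].strip()
        break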

    # Title fallbacks, tried in order: Open Graph title, <title>, then the first <h1>.
    try:
        if not page_title or page_title == "Untitled Page":
            og_title = soup.find("meta", property="og:title")
            if og_title and og_title.has_attr("content"):
                page_title = og_title["content"].strip()
    except Exception:
        pass

    if not page_title or page_title == "Untitled Page":
        try:
            title_tag = soup.find("title")
            if title_tag:
                page_title = title_tag.get_text().strip()
        except Exception:
            pass

    if not page_title or page_title == "Untitled Page":
        try:
            h1_tag = soup.find("h1")
            if h1_tag:
                page_title = h1_tag.get_text().strip()
        except Exception:
            pass

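    # Use the host name (without a leading "www.") as the website title.
    try:
        match = re.search(r"(?:https?://)?(?:www\.)?([^/]+)", web_address)
        if match:
            website_title = match.group(1)
    except Exception:
        pass

    # Author fallback 1: visible text of the form "By First Last".
    try:
        if not first_name or not last_name:
            for tag in soup.find_all(["p", "div", "span"]):
                # A space separator keeps "By" apart from a name held in a nested tag.
                text = tag.get_text(" ", strip=True)
                if text.lower().startswith("by "):
                    match = re.match(r"[Bb]y ([A-Z][a-z]+) ([A-Z][a-z]+)", text)
                    if match:
                        first_name = match.group(1)
                        last_name = match.group(2)
                        break
    except Exception:
        pass

    # Author fallback 2: an element whose class name contains "author" or "byline".
    try:
        if not first_name or not last_name:
            author_tag = soup.find(attrs={"class": re.compile(r"(author|byline).*", re.I)})
            if author_tag:
                # Drop a leading "By" so it is not mistaken for the first name.
                author_name = re.sub(r"^[Bb]y\s+", "", author_tag.get_text(" ", strip=True))
                name_parts = author_name.split()
                if len(name_parts) >= 2:
                    first_name = name_parts[0]
                    last_name = name_parts[-1]
    except Exception:
        pass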

    # Date fallback 1: a <time> element, preferring its machine-readable datetime attribute.
    try:
        if not date_published:
            time_tag = soup.find("time")
            if time_tag and time_tag.has_attr("datetime"):
                parsed = parser.parse(time_tag["datetime"])
                date_published = parsed.strftime("%B %d, %Y")
            elif time_tag and time_tag.string:
                parsed = parser.parse(time_tag.string)
                date_published = parsed.strftime("%B %d, %Y")
    except Exception:
        pass

    # Date fallback 2 (weak heuristic): the first 19xx or 20xx year anywhere in the page text.
    try:
        if not date_published:
            all_text = soup.get_text()
            match = re.search(r'\b(20\d{2}|19\d{2})\b', all_text)
            if match:
                date_published = match.group(1)
    except Exception:
        pass

    return (first_name, last_name, page_title.strip(), website_title.strip(), date_published, date_accessed)


def apa_compile(web_address):
    first_name, last_name, page_title, website_title, date_published, _ = citation_components(web_address)

    if first_name and last_name:
        citation = f"{last_name}, {first_name[0]}."
    else:
        citation = page_title

    if date_published:
        citation += f" ({date_published}). "
    else:
        citation += " (n.d.). "

    if first_name and last_name:
        citation += f"{page_title}. "

    citation += f"{website_title}. Retrieved from {web_address}"
    return citation


def chicago_compile(web_address):
    first_name, last_name, page_title, website_title, date_published, date_accessed = citation_components(web_address)

    # The trailing space lives in the author part so an authorless citation has no leading space.
    author = f"{first_name} {last_name}. " if first_name and last_name else ""
    site = f"<i>{website_title}</i>"
    citation = f"{author}\"{page_title}.\" {site}."

    if date_published:
        citation += f" Published {date_published}."
    citation += f" Accessed {date_accessed}. {web_address}"
    return citation


def mla_compile(web_address):
    first_name, last_name, page_title, website_title, date_published, date_accessed = citation_components(web_address)
    site = f"<i>{website_title}</i>"

    # The trailing space lives in the author part so an authorless citation has no leading space.
    author = f"{last_name}, {first_name}. " if first_name and last_name else ""
    citation = f"{author}\"{page_title}.\" {site}"
    if date_published:
        citation += f", {date_published}"
    citation += f", {web_address}. Accessed {date_accessed}."
    return citation


def citation_from_doi(doi: str, style: str):
    import requests  # third-party HTTP client

    # Accept either a bare DOI or a full https://doi.org/ link.
    doi = doi.strip()
    if doi.startswith("https://doi.org/"):
        doi = doi[len("https://doi.org/"):]

    url = f"https://api.crossref.org/works/{doi}"
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # surface 404s for unknown DOIs as errors
        data = response.json()["message"]

        # Only the first listed author is used.
        author_list = data.get("author", [])
        if author_list:
            first = author_list[0].get("given", "")
            last = author_list[0].get("family", "")
        else:
            first = last = ""

        # Crossref returns titles as lists, which may be empty.
        title = (data.get("title") or ["Untitled"])[0]
        container = (data.get("container-title") or [""])[0]
        # "published-print" is missing for online-only works, so fall back to "issued".
        date_info = data.get("published-print") or data.get("issued") or {}
        year = (date_info.get("date-parts") or [[None]])[0][0] or "n.d."
        accessed = datetime.date.today().strftime("%d %b. %Y")
        link = f"https://doi.org/{doi}"

        if style == "apa":
            if first and last:
                return f"{last}, {first[0]}. ({year}). {title}. {container}. {link}"
            else:
                return f"{title} ({year}). {container}. {link}"

        elif style == "chicago":
            author = f"{first} {last}. " if first and last else ""
            return f"{author}\"{title}.\" <i>{container}</i>. Published {year}. Accessed {accessed}. {link}"

        elif style == "mla":
            author = f"{last}, {first}. " if first and last else ""
            return f"{author}\"{title}.\" <i>{container}</i>, {year}, {link}. Accessed {accessed}."

        # Previously an unknown style fell through and returned None.
        return f"Unsupported citation style: {style}"

    except Exception as e:
        return f"Error generating citation from DOI: {e}"
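
The JSON-LD step in `citation_components` can be tried offline, without fetching a page. The sketch below is an illustration rather than the repository's code: it swaps BeautifulSoup for the standard library's `html.parser` so it runs with no third-party packages, and the names `JsonLdCollector`, `author_name`, and `article_fields` are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def author_name(author):
    """Normalize the JSON-LD author field, which may be a str, dict, or list."""
    if isinstance(author, list):
        author = author[0] if author else ""
    if isinstance(author, dict):
        author = author.get("name", "")
    return author.strip() if isinstance(author, str) else ""


def article_fields(html):
    """Return (author, headline, datePublished) from the first Article-like node."""
    collector = JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of aborting
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if any(t in ("Article", "NewsArticle") for t in types):
                return (author_name(node.get("author")),
                        str(node.get("headline", "")).strip(),
                        node.get("datePublished", ""))
    return ("", "", "")
```

Returning raw fields and leaving name splitting and date formatting to the caller keeps the parsing step easy to test against saved HTML snippets.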

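`citation_from_doi` strips only the exact prefix `https://doi.org/`, so input such as `doi:10.…` or an `http://dx.doi.org/` link would be sent to Crossref unchanged. One possible way to normalize pasted DOIs first is sketched below. The helper names `normalize_doi` and `looks_like_doi` are hypothetical, the prefix list is illustrative rather than exhaustive, and the shape check is a loose heuristic, not a full DOI validator.

```python
import re

# Prefixes commonly seen in pasted DOIs (illustrative list, not exhaustive).
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or "doi:" label from a DOI."""
    return _DOI_PREFIX.sub("", raw.strip())


def looks_like_doi(doi: str) -> bool:
    """Loose shape check: "10.", a numeric registrant code, a slash, then a suffix."""
    return re.fullmatch(r"10\.\d{4,9}/\S+", doi) is not None
```

A caller could run `normalize_doi` on the user's input and skip the network request when `looks_like_doi` returns False.
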
GitHub Events

Total
  • Push event: 8
  • Fork event: 1
  • Create event: 2
Last Year
  • Push event: 8
  • Fork event: 1
  • Create event: 2