Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (0.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ritik3000
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 54.1 MB
Statistics
  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 6 years ago · Last pushed about 5 years ago
Metadata Files
Readme Citation

README.md

researcher_recommendor

Researcher recommender using tie, co-author, abstract, venue, and references as features. It involves social network analysis and NLP techniques.

Owner

  • Name: Ritik Dhingra
  • Login: ritik3000
  • Kind: user
  • Location: Gurgaon, India
  • Company: IIT (BHU), Varanasi

Fintech Enthusiast, Data science enthusiast

Citation (Citation_Score.py)

import pandas as pd
import time
import json
import multiprocessing
from multiprocessing import Process
import math
import pickle
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE, SIG_DFL)
with open('Aut_list.txt', 'rb') as f:
    author_list = pickle.load(f)
print("Read Author_list")

df1=pd.read_json("dblp-ref-0.json",lines=True)
print("a")
df2=pd.read_json("dblp-ref-1.json",lines=True)
print("b")
df3=pd.read_json("dblp-ref-2.json",lines=True)
print("c")
df4=pd.read_json("dblp-ref-3.json",lines=True)
print("d")
frames=[df1,df2,df3,df4]
df=pd.concat(frames)
df.dropna(inplace=True)

df=df.reset_index()


print("Read DBLP")

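def thread_names(l,r):
    # For each author in author_list[l:r], collect the ids of the papers that list them.
    for author in author_list[l:r]:
        paper_ids = []
        for i in range(df.shape[0]):
            if i%100000==0:
                print(l,":",r,":",i//100000)
            row = df.iloc[i]
            if author in row.authors:
                paper_ids.append(row.id)
        # Store the list in one assignment: a list read from a Manager dict proxy
        # is a local copy, so appending to it in place would not reach the shared dict.
        if paper_ids:
            authors_dict[author] = paper_ids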

def multiprocessed():
    processes = []
    n = 0
    threads = 40
    # Split author_list into contiguous, non-overlapping slices, one per process.
    chunk = math.ceil(len(author_list) / threads)
    for i in range(0, threads):
        stop = min(n + chunk, len(author_list))
        p = Process(target=thread_names, args=(n, stop))
        # Slices are half-open, so the next one starts exactly at stop.
        n = stop
        processes.append(p)
    # Start the processes
    for p in processes:
        p.start()
    # Ensure all processes have finished execution
    count = 0
    for p in processes:
        p.join()
        print("Process %d is over and time taken is : %f" %(count, (time.time() - start)))
        count += 1

start = time.time()
if __name__=="__main__":
    manager = multiprocessing.Manager()
    authors_dict = manager.dict()
    multiprocessed()

    # Convert the Manager proxy into a plain dict so it can be pickled.
    print(type(authors_dict))
    authors_dict=dict(authors_dict)
    print(type(authors_dict))

    # Write the author -> paper ids mapping and close the file afterwards.
    with open("Coauthor_Citation_score.pkl", "wb") as f:
        pickle.dump(authors_dict, f)

    for i in authors_dict:
        print(i)
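
The script above scans every paper once per author. As a rough sketch of an alternative, the same author-to-paper mapping can be built in a single pass over the papers. This assumes each record has an `id` and a list of `authors`, as the script's column accesses suggest. The function name `build_author_index` and the toy records are hypothetical and are not part of the repository.

```python
from collections import defaultdict

def build_author_index(papers, authors_of_interest):
    """Map each author of interest to the ids of the papers that list them.

    papers: iterable of (paper_id, list_of_author_names) pairs.
    """
    wanted = set(authors_of_interest)
    index = defaultdict(list)
    # One pass over the papers instead of one full scan per author.
    for paper_id, authors in papers:
        # The set intersection keeps only the authors we care about and
        # avoids double-counting a name repeated on the same paper.
        for author in wanted.intersection(authors):
            index[author].append(paper_id)
    return dict(index)

# Toy records standing in for the DBLP data (hypothetical values).
toy_papers = [
    ("p1", ["Ann", "Bob"]),
    ("p2", ["Bob"]),
    ("p3", ["Cid"]),
]
index = build_author_index(toy_papers, ["Ann", "Bob", "Dee"])
print(index)
```

With a DataFrame like the one in the script, the pairs could be supplied as `zip(df["id"], df["authors"])`. Authors with no matching paper are left out of the result, which matches the script's behavior.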
