
Repository

Basic Info
  • Host: GitHub
  • Owner: AymenRaouf
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 33 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

ConstrucTED : Constructing Tailored Educational Datasets From Online Courses

-----------------------------------------------------

ConstrucTED is a tool built on top of Google APIs, enabling the efficient creation of custom educational datasets from YouTube playlists. It creates datasets from video course transcripts, providing a ready-to-use solution that significantly shortens the time required to create such datasets. The resulting datasets are versatile and suitable for tasks like classification and learning path creation.
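
As a rough illustration of the kind of pipeline described above, the sketch below lists the videos of a playlist with the YouTube Data API and joins each video's transcript segments into plain text. It is not ConstrucTED's actual implementation: the function names (`list_playlist_video_ids`, `fetch_transcript_text`, `segments_to_text`) are hypothetical, and it assumes `google-api-python-client` and `youtube-transcript-api` 0.6.2 as pinned in requirements.txt.

```python
from typing import Dict, Iterable, List


def segments_to_text(segments: Iterable[Dict]) -> str:
    """Join transcript segments (dicts with a 'text' key) into one string."""
    parts = (segment["text"].strip() for segment in segments)
    return " ".join(part for part in parts if part)


def list_playlist_video_ids(api_key: str, playlist_id: str) -> List[str]:
    """Return the video IDs of a playlist using the YouTube Data API v3."""
    # Imported lazily so segments_to_text works without the client installed.
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=api_key)
    request = youtube.playlistItems().list(
        part="contentDetails",
        playlistId=playlist_id,
        maxResults=50,  # largest page size the API accepts
    )
    video_ids: List[str] = []
    while request is not None:
        response = request.execute()
        video_ids.extend(
            item["contentDetails"]["videoId"] for item in response["items"]
        )
        # list_next returns None once there are no more pages.
        request = youtube.playlistItems().list_next(request, response)
    return video_ids


def fetch_transcript_text(video_id: str) -> str:
    """Download one video's transcript and flatten it to plain text."""
    # get_transcript is the 0.6.x interface; later releases changed this API.
    from youtube_transcript_api import YouTubeTranscriptApi

    return segments_to_text(YouTubeTranscriptApi.get_transcript(video_id))
```

With a valid key, calling `fetch_transcript_text` on each ID returned by `list_playlist_video_ids` would yield one text per video. Videos without captions raise an exception in `youtube-transcript-api`, so real code would need to handle that case.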

-----------------------------------------------------

Installation

Download the project then use the package manager pip to install the dependencies.

```bash
pip install -r requirements.txt
```

-----------------------------------------------------

Usage

  • Before using ConstrucTED, obtain a personal Google API key.
  • Create a file called .env in the root of the project and add this line with your personal API key:

    ```bash
    GOOGLE_API_KEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
    ```
  • You can run the main.ipynb file to create datasets.
  • There are some pre-made input files in the Input folder. You can use these sample input files to create datasets.
  • The datasets that can be created using these sample input files are available in the Output folder for direct usage.
  • You can create your own input files and use them in the code:

    ```python
    input_file = 'path_to_your_input_file'
    my_dataset.create_series(input_file)
    my_dataset.save(path='path_to_an_output_location')
    ```
  • This code generates three files, as described in the article: series.csv, episodes.csv, and chapters.csv.
  • These files contain the created dataset.
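
Once the dataset has been saved, the three files can be loaded for inspection. The following is a minimal sketch, assuming the files are comma-separated with a header row, UTF-8 encoded, and all saved in the same folder. `load_dataset` is a hypothetical helper, not part of ConstrucTED, and no column names are assumed.

```python
import csv
from pathlib import Path
from typing import Dict, List

# File names as listed in the README.
DATASET_FILES = ("series.csv", "episodes.csv", "chapters.csv")


def load_dataset(output_dir: str) -> Dict[str, List[dict]]:
    """Read the three CSV files into lists of row dictionaries.

    The result is keyed by table name: 'series', 'episodes', 'chapters'.
    """
    tables: Dict[str, List[dict]] = {}
    for filename in DATASET_FILES:
        path = Path(output_dir) / filename
        # newline="" lets the csv module handle quoted line breaks itself.
        with path.open(newline="", encoding="utf-8") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables
```

For example, `load_dataset('path_to_an_output_location')["episodes"]` would return one dictionary per row of episodes.csv, with the header fields as keys.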

-----------------------------------------------------

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

-----------------------------------------------------

Citation

If you use this code in your research or projects, please cite the following article:

ConstrucTED : Constructing Tailored Educational Datasets From Online Courses

Aymen Bazouzi, Zoltan Miklos, Mickaël Foursov, Hoël Le Capitaine.

Proceedings of the 16th International Conference on Computer Supported Education (CSEDU), 2024.

📖 BibTeX

```bibtex
@conference{ekm24,
  author={Aymen Bazouzi and Zoltan Miklos and Mickaël Foursov and Hoël {Le Capitaine}},
  title={ConstrucTED: Constructing Tailored Educational Datasets from Online Courses},
  booktitle={Proceedings of the 16th International Conference on Computer Supported Education - Volume 1: EKM},
  year={2024},
  pages={645-652},
  publisher={SciTePress},
  organization={INSTICC},
  doi={10.5220/0012745000003693},
  isbn={978-989-758-697-2},
  issn={2184-5026},
}
```

License

MIT

Owner

  • Name: Aymen
  • Login: AymenRaouf
  • Kind: user
  • Location: France

PhD researcher

Citation (CITATION.cff)

```yaml
cff-version: 1.2.0
message: "If you use this software, please cite the following paper."
authors:
  - family-names: Bazouzi
    given-names: Aymen
  - family-names: Miklos
    given-names: Zoltan
  - family-names: Foursov
    given-names: Mickaël
  - family-names: Le Capitaine
    given-names: Hoël
title: "ConstrucTED : Constructing Tailored Educational Datasets From Online Courses"
conference: "Proceedings of the 16th International Conference on Computer Supported Education (CSEDU)"
year: 2024
doi: "10.5220/0012745000003693"
url: "https://www.scitepress.org/Link.aspx?doi=10.5220/0012745000003693"
```

GitHub Events

Total
  • Watch event: 1
  • Push event: 1
Last Year
  • Watch event: 1
  • Push event: 1

Dependencies

requirements.txt pypi
  • Pygments ==2.17.2
  • asttokens ==2.4.1
  • cachetools ==5.3.3
  • certifi ==2024.2.2
  • charset-normalizer ==3.3.2
  • comm ==0.2.1
  • debugpy ==1.8.1
  • decorator ==5.1.1
  • exceptiongroup ==1.2.0
  • executing ==2.0.1
  • google-api-core ==2.17.1
  • google-api-python-client ==2.119.0
  • google-auth ==2.28.1
  • google-auth-httplib2 ==0.2.0
  • googleapis-common-protos ==1.62.0
  • httplib2 ==0.22.0
  • idna ==3.6
  • ipykernel ==6.29.3
  • ipython ==8.22.1
  • jedi ==0.19.1
  • jupyter_client ==8.6.0
  • jupyter_core ==5.7.1
  • matplotlib-inline ==0.1.6
  • nest-asyncio ==1.6.0
  • numpy ==1.26.4
  • packaging ==23.2
  • pandas ==2.2.1
  • parso ==0.8.3
  • pexpect ==4.9.0
  • platformdirs ==4.2.0
  • prompt-toolkit ==3.0.43
  • protobuf ==4.25.3
  • psutil ==5.9.8
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pyasn1 ==0.5.1
  • pyasn1-modules ==0.3.0
  • pyparsing ==3.1.1
  • python-dateutil ==2.8.2
  • pytz ==2024.1
  • pyzmq ==25.1.2
  • requests ==2.31.0
  • rsa ==4.9
  • six ==1.16.0
  • stack-data ==0.6.3
  • tornado ==6.4
  • traitlets ==5.14.1
  • tzdata ==2024.1
  • uritemplate ==4.1.1
  • urllib3 ==2.2.1
  • wcwidth ==0.2.13
  • youtube-transcript-api ==0.6.2