https://github.com/ctuavastlab/url2mill.jl
An extension of Mill.jl to convert URLs to Mill structure
Science Score: 23.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (3.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CTUAvastLab
- Language: Julia
- Default Branch: main
- Size: 17.6 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Url2Mill.jl
An extension of Mill.jl to convert URLs to Mill structure
A simple library implementing the URL representation from the paper "Nested Multiple Instance Learning in Modelling of HTTP network traffic" (Tomas Pevny, Marek Dedic, 2020).
Example:
```julia
using Url2Mill

julia> ds = url2mill("st.360buyimg.com/m/css/2014/index/home20175_9.css?v=jd201705182030")
ProductNode  # 1 obs, 152 bytes
  ├── hostname: BagNode  # 1 obs, 104 bytes
  │               ╰── ArrayNode(2053×3 NGramMatrix with Int64 elements)  # 3 obs, 166 bytes
  ├────── path: BagNode  # 1 obs, 104 bytes
  │               ╰── ArrayNode(2053×5 NGramMatrix with Int64 elements)  # 5 obs, 214 bytes
  ╰───── query: BagNode  # 1 obs, 136 bytes
                  ╰── ProductNode  # 1 obs, 64 bytes
                        ├──── key: ArrayNode(2053×1 NGramMatrix with Int64 elements)  # 1 o ⋯
                        ╰── value: ArrayNode(2053×1 NGramMatrix with Int64 elements)  # 1 o ⋯
```
To represent the n-grams of strings directly as sparse arrays (`SparseMatrixCSC` from `SparseArrays`), pass `use_sparse_arrays = true`:
```julia
julia> ds = url2mill("st.360buyimg.com/m/css/2014/index/home20175_9.css?v=jd201705182030"; use_sparse_arrays = true)
ProductNode  # 1 obs, 184 bytes
  ├── hostname: BagNode  # 1 obs, 112 bytes
  │               ╰── ArrayNode(2053×3 SparseMatrixCSC with Int64 elements)  # 3 obs, 552 b ⋯
  ├────── path: BagNode  # 1 obs, 112 bytes
  │               ╰── ArrayNode(2053×5 SparseMatrixCSC with Int64 elements)  # 5 obs, 888 b ⋯
  ╰───── query: BagNode  # 1 obs, 152 bytes
                  ╰── ProductNode  # 1 obs, 80 bytes
                        ├──── key: ArrayNode(2053×1 SparseMatrixCSC with Int64 elements)  # ⋯
                        ╰── value: ArrayNode(2053×1 SparseMatrixCSC with Int64 elements)  # ⋯
```
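The observation counts in the example output (3 for `hostname`, 5 for `path`, 1 for `query`) are consistent with splitting the hostname on `.`, the path on `/`, and the query into `key=value` pairs. The sketch below illustrates that decomposition using only Julia's `Base`. The helper `split_url` is hypothetical and is not part of Url2Mill.jl; the splitting rules are an assumption inferred from the counts above, not taken from the package source.

```julia
# Hypothetical helper (NOT the Url2Mill.jl implementation): split a scheme-less
# URL into hostname tokens, path segments, and query key => value pairs.
function split_url(url::AbstractString)
    # Everything after the first '?' is treated as the query string.
    base, query = occursin('?', url) ? split(url, '?'; limit = 2) : (url, "")
    # Everything before the first '/' is the hostname; the rest is the path.
    host, path = occursin('/', base) ? split(base, '/'; limit = 2) : (base, "")

    hostname = String.(split(host, '.'; keepempty = false))
    segments = String.(split(path, '/'; keepempty = false))

    # Each query item becomes a key => value pair; a missing value becomes "".
    items = map(split(query, '&'; keepempty = false)) do item
        kv = split(item, '='; limit = 2)
        String(kv[1]) => (length(kv) == 2 ? String(kv[2]) : "")
    end

    return (hostname = hostname, path = segments, query = items)
end

parts = split_url("st.360buyimg.com/m/css/2014/index/home20175_9.css?v=jd201705182030")
```

Under these assumptions, `parts.hostname` has three tokens, `parts.path` has five segments, and `parts.query` holds a single pair, matching the number of columns of the corresponding `ArrayNode`s in the output above.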
Owner
- Name: Joint research lab of Czech Technical University in Prague and Avast
- Login: CTUAvastLab
- Kind: organization
- Location: Prague
- Repositories: 4
- Profile: https://github.com/CTUAvastLab
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0